[jira] [Updated] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)
[ https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19996: - Attachment: HBASE-19996.master.001.patch > Some nonce procs might not be cleaned up (follow up HBASE-19756) > > > Key: HBASE-19996 > URL: https://issues.apache.org/jira/browse/HBASE-19996 > Project: HBase > Issue Type: Bug >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Major > Fix For: 1.3.2, 1.5.0, 1.4.2 > > Attachments: HBASE-19996.branch-1.4.001.patch, > HBASE-19996.branch-1.4.001.patch, HBASE-19996.master.001.patch > > > Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. > Unfortunately, the patch for branch-1 might not remove some valid procs too. > The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and > also adds another test to branch-2. Thanks to [~toffer] for flagging this > internally. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)
[ https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365060#comment-16365060 ] Thiruvel Thirumoolan commented on HBASE-19996: -- Test case patch for branch-2 for some reason is flaky when run multiple times, will check and then upload. > Some nonce procs might not be cleaned up (follow up HBASE-19756) > > > Key: HBASE-19996 > URL: https://issues.apache.org/jira/browse/HBASE-19996 > Project: HBase > Issue Type: Bug >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Major > Fix For: 1.5.0, 1.4.2 > > Attachments: HBASE-19996.branch-1.4.001.patch, > HBASE-19996.branch-1.4.001.patch > > > Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. > Unfortunately, the patch for branch-1 might not remove some valid procs too. > The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and > also adds another test to branch-2. Thanks to [~toffer] for flagging this > internally. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)
[ https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19996: - Description: Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. Unfortunately, the patch for branch-1 might not remove some valid procs too. The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and also adds another test to branch-2. Thanks to [~toffer] for flagging this internally. (was: Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. Unfortunately, the patch for branch-1 might not remove some valid procs too. The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and also adds another test to branch-2.) > Some nonce procs might not be cleaned up (follow up HBASE-19756) > > > Key: HBASE-19996 > URL: https://issues.apache.org/jira/browse/HBASE-19996 > Project: HBase > Issue Type: Bug >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Major > Fix For: 2.0.0, 1.3.2, 1.5.0, 1.4.2 > > Attachments: HBASE-19996.branch-1.4.001.patch > > > Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. > Unfortunately, the patch for branch-1 might not remove some valid procs too. > The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and > also adds another test to branch-2. Thanks to [~toffer] for flagging this > internally. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)
[ https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19996: - Status: Patch Available (was: Open) > Some nonce procs might not be cleaned up (follow up HBASE-19756) > > > Key: HBASE-19996 > URL: https://issues.apache.org/jira/browse/HBASE-19996 > Project: HBase > Issue Type: Bug >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Major > Fix For: 2.0.0, 1.3.2, 1.5.0, 1.4.2 > > Attachments: HBASE-19996.branch-1.4.001.patch > > > Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. > Unfortunately, the patch for branch-1 might not remove some valid procs too. > The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and > also adds another test to branch-2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)
[ https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19996: - Attachment: HBASE-19996.branch-1.4.001.patch > Some nonce procs might not be cleaned up (follow up HBASE-19756) > > > Key: HBASE-19996 > URL: https://issues.apache.org/jira/browse/HBASE-19996 > Project: HBase > Issue Type: Bug >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Major > Fix For: 2.0.0, 1.3.2, 1.5.0, 1.4.2 > > Attachments: HBASE-19996.branch-1.4.001.patch > > > Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. > Unfortunately, the patch for branch-1 might not remove some valid procs too. > The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and > also adds another test to branch-2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19756) Master NPE during completed failed proc eviction
[ https://issues.apache.org/jira/browse/HBASE-19756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363345#comment-16363345 ] Thiruvel Thirumoolan commented on HBASE-19756: -- [~apurtell]/[~yuzhih...@gmail.com] - The master patch here is fine. I wanted to rework the branch-1 patch, but fell sick and the patch was committed in the meantime. Raised HBASE-19996 as a follow-up to fix the problem with the branch-1 patch. > Master NPE during completed failed proc eviction > > > Key: HBASE-19756 > URL: https://issues.apache.org/jira/browse/HBASE-19756 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.3.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Major > Fix For: 2.0.0, 3.0.0, 1.3.2, 1.4.1, 1.5.0 > > Attachments: HBASE-19756.branch-1.4.001.patch, > HBASE-19756.branch-1.4.002.patch, HBASE-19756.branch-1.4.003.patch, > HBASE-19756.master.001.patch > > > When procedures like Create table fails due to say AccessDeniedException, > then a rollback procedure is created. 
When the rollback is being cleaned up, > it results in an NPE because those nonce procs aren't persisted > Stack trace when this happens: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:385) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:547) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:504) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:453) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$CompletedProcedureCleaner.periodicExecute(ProcedureExecutor.java:184) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.timeoutLoop(ProcedureExecutor.java:995) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$500(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$3.run(ProcedureExecutor.java:507) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)
[ https://issues.apache.org/jira/browse/HBASE-19996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19996: - Fix Version/s: 1.4.2 1.5.0 1.3.2 2.0.0 > Some nonce procs might not be cleaned up (follow up HBASE-19756) > > > Key: HBASE-19996 > URL: https://issues.apache.org/jira/browse/HBASE-19996 > Project: HBase > Issue Type: Bug >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Major > Fix For: 2.0.0, 1.3.2, 1.5.0, 1.4.2 > > > Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. > Unfortunately, the patch for branch-1 might not remove some valid procs too. > The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and > also adds another test to branch-2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-19996) Some nonce procs might not be cleaned up (follow up HBASE-19756)
Thiruvel Thirumoolan created HBASE-19996: Summary: Some nonce procs might not be cleaned up (follow up HBASE-19756) Key: HBASE-19996 URL: https://issues.apache.org/jira/browse/HBASE-19996 Project: HBase Issue Type: Bug Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Follow up to HBASE-19756 which dealt with NPEs during proc cleanup. Unfortunately, the patch for branch-1 might not remove some valid procs too. The branch-2 patch doesn't have this problem. This fixes the branch-1 bug and also adds another test to branch-2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19756) Master NPE during completed failed proc eviction
[ https://issues.apache.org/jira/browse/HBASE-19756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324438#comment-16324438 ] Thiruvel Thirumoolan commented on HBASE-19756: -- Thanks for the review [~tedyu] 1. Latest patch precommit passed for 1.4 2. For master, the problem exists, but exception signature is different. The patch also is slightly different. Uploaded patch, lemme know how it looks. I am guessing my assumption that failed nonce procs don't have to be persisted is correct. > Master NPE during completed failed proc eviction > > > Key: HBASE-19756 > URL: https://issues.apache.org/jira/browse/HBASE-19756 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.3.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 1.3.2, 1.4.1 > > Attachments: HBASE-19756.branch-1.4.001.patch, > HBASE-19756.branch-1.4.002.patch, HBASE-19756.branch-1.4.003.patch, > HBASE-19756.master.001.patch > > > When procedures like Create table fails due to say AccessDeniedException, > then a rollback procedure is created. 
When the rollback is being cleaned up, > it results in an NPE because those nonce procs aren't persisted > Stack trace when this happens: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:385) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:547) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:504) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:453) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$CompletedProcedureCleaner.periodicExecute(ProcedureExecutor.java:184) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.timeoutLoop(ProcedureExecutor.java:995) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$500(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$3.run(ProcedureExecutor.java:507) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19756) Master NPE during completed failed proc eviction
[ https://issues.apache.org/jira/browse/HBASE-19756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19756: - Attachment: HBASE-19756.master.001.patch > Master NPE during completed failed proc eviction > > > Key: HBASE-19756 > URL: https://issues.apache.org/jira/browse/HBASE-19756 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.3.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 1.3.2, 1.4.1 > > Attachments: HBASE-19756.branch-1.4.001.patch, > HBASE-19756.branch-1.4.002.patch, HBASE-19756.branch-1.4.003.patch, > HBASE-19756.master.001.patch > > > When procedures like Create table fails due to say AccessDeniedException, > then a rollback procedure is created. When the rollback is being cleaned up, > it results in an NPE because those nonce procs aren't persisted > Stack trace when this happens: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:385) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:547) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:504) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:453) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$CompletedProcedureCleaner.periodicExecute(ProcedureExecutor.java:184) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.timeoutLoop(ProcedureExecutor.java:995) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$500(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$3.run(ProcedureExecutor.java:507) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19756) Master NPE during completed failed proc eviction
[ https://issues.apache.org/jira/browse/HBASE-19756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19756: - Attachment: HBASE-19756.branch-1.4.003.patch > Master NPE during completed failed proc eviction > > > Key: HBASE-19756 > URL: https://issues.apache.org/jira/browse/HBASE-19756 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.3.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 1.3.2, 1.4.1 > > Attachments: HBASE-19756.branch-1.4.001.patch, > HBASE-19756.branch-1.4.002.patch, HBASE-19756.branch-1.4.003.patch > > > When procedures like Create table fails due to say AccessDeniedException, > then a rollback procedure is created. When the rollback is being cleaned up, > it results in an NPE because those nonce procs aren't persisted > Stack trace when this happens: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:385) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:547) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:504) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:453) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$CompletedProcedureCleaner.periodicExecute(ProcedureExecutor.java:184) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.timeoutLoop(ProcedureExecutor.java:995) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$500(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$3.run(ProcedureExecutor.java:507) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19756) Master NPE during completed failed proc eviction
[ https://issues.apache.org/jira/browse/HBASE-19756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19756: - Attachment: HBASE-19756.branch-1.4.002.patch > Master NPE during completed failed proc eviction > > > Key: HBASE-19756 > URL: https://issues.apache.org/jira/browse/HBASE-19756 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.3.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 1.3.2, 1.4.1 > > Attachments: HBASE-19756.branch-1.4.001.patch, > HBASE-19756.branch-1.4.002.patch > > > When procedures like Create table fails due to say AccessDeniedException, > then a rollback procedure is created. When the rollback is being cleaned up, > it results in an NPE because those nonce procs aren't persisted > Stack trace when this happens: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:385) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:547) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:504) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:453) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$CompletedProcedureCleaner.periodicExecute(ProcedureExecutor.java:184) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.timeoutLoop(ProcedureExecutor.java:995) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$500(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$3.run(ProcedureExecutor.java:507) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19756) Master NPE during completed failed proc eviction
[ https://issues.apache.org/jira/browse/HBASE-19756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19756: - Fix Version/s: 1.4.1 1.3.2 Status: Patch Available (was: Open) > Master NPE during completed failed proc eviction > > > Key: HBASE-19756 > URL: https://issues.apache.org/jira/browse/HBASE-19756 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1, 1.4.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 1.3.2, 1.4.1 > > Attachments: HBASE-19756.branch-1.4.001.patch > > > When procedures like Create table fails due to say AccessDeniedException, > then a rollback procedure is created. When the rollback is being cleaned up, > it results in an NPE because those nonce procs aren't persisted > Stack trace when this happens: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:385) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:547) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:504) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:453) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$CompletedProcedureCleaner.periodicExecute(ProcedureExecutor.java:184) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.timeoutLoop(ProcedureExecutor.java:995) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$500(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$3.run(ProcedureExecutor.java:507) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19756) Master NPE during completed failed proc eviction
[ https://issues.apache.org/jira/browse/HBASE-19756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19756: - Attachment: HBASE-19756.branch-1.4.001.patch > Master NPE during completed failed proc eviction > > > Key: HBASE-19756 > URL: https://issues.apache.org/jira/browse/HBASE-19756 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.3.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Attachments: HBASE-19756.branch-1.4.001.patch > > > When procedures like Create table fails due to say AccessDeniedException, > then a rollback procedure is created. When the rollback is being cleaned up, > it results in an NPE because those nonce procs aren't persisted > Stack trace when this happens: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:385) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:547) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:504) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:453) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$CompletedProcedureCleaner.periodicExecute(ProcedureExecutor.java:184) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.timeoutLoop(ProcedureExecutor.java:995) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$500(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$3.run(ProcedureExecutor.java:507) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HBASE-19756) Master NPE during completed failed proc eviction
[ https://issues.apache.org/jira/browse/HBASE-19756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HBASE-19756: Assignee: Thiruvel Thirumoolan > Master NPE during completed failed proc eviction > > > Key: HBASE-19756 > URL: https://issues.apache.org/jira/browse/HBASE-19756 > Project: HBase > Issue Type: Bug >Affects Versions: 1.4.0, 1.3.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > > When procedures like Create table fails due to say AccessDeniedException, > then a rollback procedure is created. When the rollback is being cleaned up, > it results in an NPE because those nonce procs aren't persisted > Stack trace when this happens: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:385) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:547) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:504) > at > org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:453) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$CompletedProcedureCleaner.periodicExecute(ProcedureExecutor.java:184) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.timeoutLoop(ProcedureExecutor.java:995) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$500(ProcedureExecutor.java:78) > at > org.apache.hadoop.hbase.procedure2.ProcedureExecutor$3.run(ProcedureExecutor.java:507) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-19756) Master NPE during completed failed proc eviction
Thiruvel Thirumoolan created HBASE-19756: Summary: Master NPE during completed failed proc eviction Key: HBASE-19756 URL: https://issues.apache.org/jira/browse/HBASE-19756 Project: HBase Issue Type: Bug Affects Versions: 1.3.1, 1.4.0 Reporter: Thiruvel Thirumoolan When procedures like Create table fails due to say AccessDeniedException, then a rollback procedure is created. When the rollback is being cleaned up, it results in an NPE because those nonce procs aren't persisted Stack trace when this happens: {noformat} java.lang.NullPointerException at org.apache.hadoop.hbase.procedure2.store.ProcedureStoreTracker.delete(ProcedureStoreTracker.java:385) at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.updateStoreTracker(WALProcedureStore.java:547) at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.pushData(WALProcedureStore.java:504) at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.delete(WALProcedureStore.java:453) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$CompletedProcedureCleaner.periodicExecute(ProcedureExecutor.java:184) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.timeoutLoop(ProcedureExecutor.java:995) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$500(ProcedureExecutor.java:78) at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$3.run(ProcedureExecutor.java:507) {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
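The failure described above (deleting a completed procedure from the store when it was never persisted) and the follow-up concern in HBASE-19996 (every completed entry must still be evicted) can be illustrated with a minimal sketch. It is not the committed patch, which this thread does not show. `SketchProcedureStore`, `SketchCompletedCleaner`, and the per-entry "persisted" flag are hypothetical names invented for illustration, and the store's failure is modeled as an `IllegalStateException` rather than the NPE in the stack trace.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a procedure store; not an HBase class. */
final class SketchProcedureStore {
    private final Set<Long> persisted = new HashSet<>();

    void insert(long procId) {
        persisted.add(procId);
    }

    boolean contains(long procId) {
        return persisted.contains(procId);
    }

    /** Deleting an ID that was never inserted fails, mirroring the reported crash. */
    void delete(long procId) {
        if (!persisted.remove(procId)) {
            throw new IllegalStateException("procId " + procId + " was never persisted");
        }
    }
}

/** Hypothetical cleaner for the in-memory map of completed procedures. */
final class SketchCompletedCleaner {
    /**
     * Evicts every completed entry from memory, but only asks the store to
     * delete those flagged as persisted (map value == true).
     * Returns the number of store deletions performed.
     */
    static int evict(Map<Long, Boolean> completed, SketchProcedureStore store) {
        int deletedFromStore = 0;
        Iterator<Map.Entry<Long, Boolean>> it = completed.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Boolean> entry = it.next();
            if (entry.getValue()) {
                store.delete(entry.getKey());
                deletedFromStore++;
            }
            it.remove(); // always drop the in-memory entry, persisted or not
        }
        return deletedFromStore;
    }
}
```

The design point of the sketch is that the two actions are separate: skipping the store delete for a non-persisted procedure must not also skip its removal from the in-memory map.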
[jira] [Commented] (HBASE-19468) FNFE during scans and flushes
[ https://issues.apache.org/jira/browse/HBASE-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297769#comment-16297769 ] Thiruvel Thirumoolan commented on HBASE-19468: -- [~ram_krish] +1 I also verified with a unit test that files are cleaned up even when the scanner expires. I will run ITBLL with this patch sometime in the next few days. I guess this patch can get in and not have to wait for that. Thanks Ram! > FNFE during scans and flushes > - > > Key: HBASE-19468 > URL: https://issues.apache.org/jira/browse/HBASE-19468 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: 1.3.1 >Reporter: Thiruvel Thirumoolan >Priority: Critical > Fix For: 2.0.0, 1.4.1, 1.5.0, 1.3.3 > > Attachments: HBASE-19468-poc.patch, HBASE-19468_1.4.patch, > HBASE-19468_master.patch > > > We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at > the same time. This causes regionserver to throw an UnknownScannerException > and client retries. > This happens during the following sequence: > 1. Scanner open, client fetched some rows from regionserver and working on it > 2. Flush happens and storeScanner is updated with flushed files > (StoreScanner.updateReaders()) > 3. Compaction happens on the region while scanner is still open > 4. compaction discharger runs and cleans up the newly flushed file as we > don't have new scanners on it yet. > 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we > get a FNFE. RegionServer throws an UnknownScannerException. The client retries in 1.3. > With branch-1.4, the scan fails with a DoNotRetryIOException. > [~ram_krish], My proposal is to increment the reader count during > updateReaders() and decrement it during resetScannerStack(), so discharger > doesn't clean it up. Scan lease expiries also have to be taken care of. Am I > missing anything? Is there a better approach? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
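The reference-count proposal quoted in the description above can be sketched roughly as follows. This is only an illustration of the idea, not the proof-of-concept patch and not the approach that was eventually preferred in this thread. `SketchStoreFile`, `SketchScanner`, and their fields are hypothetical names; only `updateReaders()` and `resetScannerStack()` echo method names mentioned in the issue. The sketch is single-threaded, so the check in `tryDischarge()` is not atomic the way a real implementation would need it to be.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a flushed store file; not an HBase class. */
final class SketchStoreFile {
    final AtomicInteger refCount = new AtomicInteger();
    boolean compactedAway; // set once a compaction has replaced this file
    boolean deleted;       // set once the discharger has removed it

    /** Discharger step: delete only files that are compacted away and unreferenced. */
    boolean tryDischarge() {
        if (compactedAway && refCount.get() == 0) {
            deleted = true;
        }
        return deleted;
    }
}

/** Hypothetical scanner that pins flushed files until it has dealt with them. */
final class SketchScanner {
    private final List<SketchStoreFile> pinned = new ArrayList<>();

    /** Flush notification: pin the new file so the discharger leaves it alone. */
    void updateReaders(SketchStoreFile flushed) {
        flushed.refCount.incrementAndGet();
        pinned.add(flushed);
    }

    /** Next scan call: the pinned files must still exist; then the pins are released. */
    void resetScannerStack() {
        for (SketchStoreFile f : pinned) {
            if (f.deleted) {
                throw new IllegalStateException("file removed while scanner was open");
            }
        }
        // A real scanner would open readers here, which hold their own references.
        release();
    }

    /** Close or lease expiry: release pins even if the scan never resumed. */
    void close() {
        release();
    }

    private void release() {
        for (SketchStoreFile f : pinned) {
            f.refCount.decrementAndGet();
        }
        pinned.clear();
    }
}
```

The `close()` path corresponds to the remark that scan lease expiries also have to be handled: without it, a scanner that never calls next again would pin its files indefinitely.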
[jira] [Commented] (HBASE-19468) FNFE during scans and flushes
[ https://issues.apache.org/jira/browse/HBASE-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16288664#comment-16288664 ] Thiruvel Thirumoolan commented on HBASE-19468: -- I didn't like the ref count approach to start with, but needed something simple to show the problem and demonstrate a fix. I wanted to rework it. I prefer Ram's approach, which doesn't touch counters directly. It looks like both of us uploaded patches at more or less the same time and I missed his. Gimme a couple of days, if it's ok, just to cross-check whether anything else needs consideration. > FNFE during scans and flushes > - > > Key: HBASE-19468 > URL: https://issues.apache.org/jira/browse/HBASE-19468 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: 1.3.1 >Reporter: Thiruvel Thirumoolan >Priority: Critical > Fix For: 2.0.0, 1.4.1, 1.5.0, 1.3.3 > > Attachments: HBASE-19468-poc.patch, HBASE-19468_1.4.patch > > > We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at > the same time. This causes regionserver to throw an UnknownScannerException > and client retries. > This happens during the following sequence: > 1. Scanner open, client fetched some rows from regionserver and working on it > 2. Flush happens and storeScanner is updated with flushed files > (StoreScanner.updateReaders()) > 3. Compaction happens on the region while scanner is still open > 4. compaction discharger runs and cleans up the newly flushed file as we > don't have new scanners on it yet. > 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we > get a FNFE. RegionServer throws an UnknownScannerException. The client retries in 1.3. > With branch-1.4, the scan fails with a DoNotRetryIOException. > [~ram_krish], My proposal is to increment the reader count during > updateReaders() and decrement it during resetScannerStack(), so discharger > doesn't clean it up. Scan lease expiries also have to be taken care of. 
Am I > missing anything? Is there a better approach? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HBASE-19468) FNFE during scans and flushes
[ https://issues.apache.org/jira/browse/HBASE-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HBASE-19468: Assignee: Thiruvel Thirumoolan > FNFE during scans and flushes > - > > Key: HBASE-19468 > URL: https://issues.apache.org/jira/browse/HBASE-19468 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: 1.3.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Critical > Fix For: 2.0.0, 1.4.1, 1.5.0, 1.3.3 > > Attachments: HBASE-19468-poc.patch, HBASE-19468_1.4.patch > > > We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at > the same time. This causes regionserver to throw an UnknownScannerException > and client retries. > This happens during the following sequence: > 1. Scanner open, client fetched some rows from regionserver and working on it > 2. Flush happens and storeScanner is updated with flushed files > (StoreScanner.updateReaders()) > 3. Compaction happens on the region while scanner is still open > 4. compaction discharger runs and cleans up the newly flushed file as we > don't have new scanners on it yet. > 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we > get a FNFE. RegionServer throws an UnknownScannerException. The client retries in 1.3. > With branch-1.4, the scan fails with a DoNotRetryIOException. > [~ram_krish], My proposal is to increment the reader count during > updateReaders() and decrement it during resetScannerStack(), so discharger > doesn't clean it up. Scan lease expiries also have to be taken care of. Am I > missing anything? Is there a better approach? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HBASE-19468) FNFE during scans and flushes
[ https://issues.apache.org/jira/browse/HBASE-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HBASE-19468: Assignee: (was: Thiruvel Thirumoolan) > FNFE during scans and flushes > - > > Key: HBASE-19468 > URL: https://issues.apache.org/jira/browse/HBASE-19468 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: 1.3.1 >Reporter: Thiruvel Thirumoolan >Priority: Critical > Fix For: 2.0.0, 1.4.1, 1.5.0, 1.3.3 > > Attachments: HBASE-19468-poc.patch, HBASE-19468_1.4.patch > > > We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at > the same time. This causes regionserver to throw an UnknownScannerException > and client retries. > This happens during the following sequence: > 1. Scanner open, client fetched some rows from regionserver and working on it > 2. Flush happens and storeScanner is updated with flushed files > (StoreScanner.updateReaders()) > 3. Compaction happens on the region while scanner is still open > 4. compaction discharger runs and cleans up the newly flushed file as we > don't have new scanners on it yet. > 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we > get a FNFE. RegionServer throws an UnknownScannerException. The client retries in 1.3. > With branch-1.4, the scan fails with a DoNotRetryIOException. > [~ram_krish], My proposal is to increment the reader count during > updateReaders() and decrement it during resetScannerStack(), so discharger > doesn't clean it up. Scan lease expiries also have to be taken care of. Am I > missing anything? Is there a better approach? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19468) FNFE during scans and flushes
[ https://issues.apache.org/jira/browse/HBASE-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19468: - Attachment: HBASE-19468-poc.patch [~ram_krish], Updating patch. In our case, either the scans were slow or the client was taking time to process each set of rows it read. Many of our users had the same problem, so there could be various reasons. > FNFE during scans and flushes > - > > Key: HBASE-19468 > URL: https://issues.apache.org/jira/browse/HBASE-19468 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: 1.3.1 >Reporter: Thiruvel Thirumoolan >Priority: Critical > Fix For: 2.0.0, 1.4.1, 1.5.0, 1.3.3 > > Attachments: HBASE-19468-poc.patch, HBASE-19468_1.4.patch > > > We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at > the same time. This causes regionserver to throw an UnknownScannerException > and client retries. > This happens during the following sequence: > 1. Scanner open, client fetched some rows from regionserver and working on it > 2. Flush happens and storeScanner is updated with flushed files > (StoreScanner.updateReaders()) > 3. Compaction happens on the region while scanner is still open > 4. compaction discharger runs and cleans up the newly flushed file as we > don't have new scanners on it yet. > 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we > get a FNFE. RegionServer throws an UnknownScannerException. The client retries in 1.3. > With branch-1.4, the scan fails with a DoNotRetryIOException. > [~ram_krish], My proposal is to increment the reader count during > updateReaders() and decrement it during resetScannerStack(), so discharger > doesn't clean it up. Scan lease expiries also have to be taken care of. Am I > missing anything? Is there a better approach? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19468) FNFE during scans and flushes
[ https://issues.apache.org/jira/browse/HBASE-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19468: - Attachment: (was: HBASE-19468-poc1.patch) > FNFE during scans and flushes > - > > Key: HBASE-19468 > URL: https://issues.apache.org/jira/browse/HBASE-19468 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: 1.3.1 >Reporter: Thiruvel Thirumoolan >Priority: Critical > Fix For: 2.0.0, 1.4.1, 1.5.0, 1.3.3 > > > We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at > the same time. This causes regionserver to throw a UnknownScannerException > and client retries. > This happens during the following sequence: > 1. Scanner open, client fetched some rows from regionserver and working on it > 2. Flush happens and storeScanner is updated with flushed files > (StoreScanner.updateReaders()) > 3. Compaction happens on the region while scanner is still open > 4. compaction discharger runs and cleans up the newly flushed file as we > don't have new scanners on it yet. > 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we > get a FNFE. RegionServer throws a UnknownScannerThe client retries in 1.3. > With branch-1.4, the scan fails with a DoNotRetryIOException. > [~ram_krish], My proposal is to increment the reader count during > updateReaders() and decrement it during resetScannerStack(), so discharger > doesn't clean it up. Scan lease expiries also have to be taken care of. Am I > missing anything? Is there a better approach? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-17425) Fix calls to deprecated APIs in TestUpdateConfiguration
[ https://issues.apache.org/jira/browse/HBASE-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286840#comment-16286840 ] Thiruvel Thirumoolan commented on HBASE-17425: -- Thanks [~Jan Hentschel], branch-1.4 builds fine now. > Fix calls to deprecated APIs in TestUpdateConfiguration > --- > > Key: HBASE-17425 > URL: https://issues.apache.org/jira/browse/HBASE-17425 > Project: HBase > Issue Type: Improvement > Components: Client >Reporter: Jan Hentschel >Assignee: Jan Hentschel >Priority: Trivial > Fix For: 3.0.0, 2.0.0-beta-1 > > Attachments: HBASE-17425.master.001.patch > > > Currently there are two calls to the deprecated method > {code:java}HBaseTestingUtil.getHBaseAdmin(){code} in > *TestUpdateConfiguration*. These calls should be changed to > {code:java}HBaseTestingUtil.getAdmin(){code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-19468) FNFE during scans and flushes
[ https://issues.apache.org/jira/browse/HBASE-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286553#comment-16286553 ] Thiruvel Thirumoolan commented on HBASE-19468: -- [~ram_krish] - Thanks for taking the time to look into this. Timezone has a big impact :) Compaction happens in this case. I thought mentioning the compaction discharger was good enough, but I should have been clearer. I have updated the description now. I wanted to post a unit test and a prelim patch on Fri, but ran into another FNFE during region opening + small compaction (will raise another one once I narrow it down). We (Y!) have 1.3 on some of our clusters (the less loaded clusters) and FNFE happens at least 2-3 times a day on all of them. Some of the issues in the umbrella jira HBASE-18397 helped. I will raise new ones for whatever we find. > FNFE during scans and flushes > - > > Key: HBASE-19468 > URL: https://issues.apache.org/jira/browse/HBASE-19468 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: 1.3.1 >Reporter: Thiruvel Thirumoolan >Priority: Critical > Fix For: 2.0.0, 1.4.1, 1.5.0, 1.3.3 > > > We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at > the same time. This causes regionserver to throw an UnknownScannerException > and client retries. > This happens during the following sequence: > 1. Scanner open, client fetched some rows from regionserver and working on it > 2. Flush happens and storeScanner is updated with flushed files > (StoreScanner.updateReaders()) > 3. Compaction happens on the region while scanner is still open > 4. compaction discharger runs and cleans up the newly flushed file as we > don't have new scanners on it yet. > 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we > get a FNFE. RegionServer throws an UnknownScannerException. The client retries in 1.3. > With branch-1.4, the scan fails with a DoNotRetryIOException. 
> [~ram_krish], My proposal is to increment the reader count during > updateReaders() and decrement it during resetScannerStack(), so discharger > doesn't clean it up. Scan lease expiries also have to be taken care of. Am I > missing anything? Is there a better approach? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19468) FNFE during scans and flushes
[ https://issues.apache.org/jira/browse/HBASE-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19468: - Attachment: HBASE-19468-poc1.patch Uploading a POC patch, just to show the failures and a tentative fix. Needs more work, though. > FNFE during scans and flushes > - > > Key: HBASE-19468 > URL: https://issues.apache.org/jira/browse/HBASE-19468 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: 1.3.1 >Reporter: Thiruvel Thirumoolan >Priority: Critical > Fix For: 2.0.0, 1.4.1, 1.5.0, 1.3.3 > > Attachments: HBASE-19468-poc1.patch > > > We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at > the same time. This causes regionserver to throw an UnknownScannerException > and client retries. > This happens during the following sequence: > 1. Scanner open, client fetched some rows from regionserver and working on it > 2. Flush happens and storeScanner is updated with flushed files > (StoreScanner.updateReaders()) > 3. Compaction happens on the region while scanner is still open > 4. compaction discharger runs and cleans up the newly flushed file as we > don't have new scanners on it yet. > 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we > get a FNFE. RegionServer throws an UnknownScannerException. The client retries in 1.3. > With branch-1.4, the scan fails with a DoNotRetryIOException. > [~ram_krish], My proposal is to increment the reader count during > updateReaders() and decrement it during resetScannerStack(), so discharger > doesn't clean it up. Scan lease expiries also have to be taken care of. Am I > missing anything? Is there a better approach? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Reopened] (HBASE-17425) Fix calls to deprecated APIs in TestUpdateConfiguration
[ https://issues.apache.org/jira/browse/HBASE-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reopened HBASE-17425: -- [~Jan Hentschel], It looks like the HBaseTestingUtility.getAdmin() API exists only in 2.x, so this change should not have been pushed to any of the 1.x branches. Can you please revert it there? All branch-1 builds are failing because of this patch. I can't do a fresh checkout of branch-1.3 or branch-1.4 and build. See https://builds.apache.org/job/HBase-1.3-IT/it.test=IntegrationTestAcidGuarantees,jdk=JDK%201.8%20(latest),label=Hadoop/315/console {noformat} [INFO] BUILD FAILURE [INFO] [INFO] Total time: 51.778s [INFO] Finished at: Sat Dec 09 13:47:21 UTC 2017 [INFO] Final Memory: 151M/3543M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.5.1:testCompile (default-testCompile) on project hbase-server: Compilation failure: Compilation failure: [ERROR] warning: unknown enum constant When.UNKNOWN [ERROR] reason: class file for javax.annotation.meta.When not found [ERROR] /home/jenkins/jenkins-slave/workspace/HBase-1.3-IT/0407dd4a/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestUpdateConfiguration.java:[52,27] error: cannot find symbol [ERROR] symbol: method getAdmin() [ERROR] location: variable TEST_UTIL of type HBaseTestingUtility [ERROR] /home/jenkins/jenkins-slave/workspace/HBase-1.3-IT/0407dd4a/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestUpdateConfiguration.java:[68,27] error: cannot find symbol [ERROR] -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
[ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :hbase-server {noformat} cc [~apurtell] in case you have a problem building 1.4. > Fix calls to deprecated APIs in TestUpdateConfiguration > --- > > Key: HBASE-17425 > URL: https://issues.apache.org/jira/browse/HBASE-17425 > Project: HBase > Issue Type: Improvement > Components: Client >Reporter: Jan Hentschel >Assignee: Jan Hentschel >Priority: Trivial > Fix For: 3.0.0, 1.3.2, 1.4.1, 1.5.0, 1.2.7, 2.0.0-beta-1, 1.1.13 > > Attachments: HBASE-17425.master.001.patch > > > Currently there are two calls to the deprecated method > {code:java}HBaseTestingUtil.getHBaseAdmin(){code} in > *TestUpdateConfiguration*. These calls should be changed to > {code:java}HBaseTestingUtil.getAdmin(){code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-19468) FNFE during scans and flushes
[ https://issues.apache.org/jira/browse/HBASE-19468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-19468: - Description: We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at the same time. This causes regionserver to throw a UnknownScannerException and client retries. This happens during the following sequence: 1. Scanner open, client fetched some rows from regionserver and working on it 2. Flush happens and storeScanner is updated with flushed files (StoreScanner.updateReaders()) 3. Compaction happens on the region while scanner is still open 4. compaction discharger runs and cleans up the newly flushed file as we don't have new scanners on it yet. 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we get a FNFE. RegionServer throws a UnknownScannerThe client retries in 1.3. With branch-1.4, the scan fails with a DoNotRetryIOException. [~ram_krish], My proposal is to increment the reader count during updateReaders() and decrement it during resetScannerStack(), so discharger doesn't clean it up. Scan lease expiries also have to be taken care of. Am I missing anything? Is there a better approach? was: We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at the same time. This causes regionserver to throw a UnknownScannerException and client retries. This happens during the following sequence: 1. Scanner open, client fetched some rows from regionserver and working on it 2. Flush happens and storeScanner is updated with flushed files (StoreScanner.updateReaders()) 3. Compaction discharger runs and cleans up the newly flushed file as we don't have new scanners on it yet. 4. Client issues scan.next and during StoreScanner.resetScannerStack(), we get a FNFE. RegionServer throws a UnknownScannerThe client retries in 1.3. With branch-1.4, the scan fails with a DoNotRetryIOException. 
[~ram_krish], My proposal is to increment the reader count during updateReaders() and decrement it during resetScannerStack(), so discharger doesn't clean it up. Scan lease expiries also have to be taken care of. Am I missing anything? Is there a better approach? > FNFE during scans and flushes > - > > Key: HBASE-19468 > URL: https://issues.apache.org/jira/browse/HBASE-19468 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Affects Versions: 1.3.1 >Reporter: Thiruvel Thirumoolan >Priority: Critical > Fix For: 2.0.0, 1.4.1, 1.5.0, 1.3.3 > > > We see FNFE exceptions on our 1.3 clusters when scans and flushes happen at > the same time. This causes regionserver to throw a UnknownScannerException > and client retries. > This happens during the following sequence: > 1. Scanner open, client fetched some rows from regionserver and working on it > 2. Flush happens and storeScanner is updated with flushed files > (StoreScanner.updateReaders()) > 3. Compaction happens on the region while scanner is still open > 4. compaction discharger runs and cleans up the newly flushed file as we > don't have new scanners on it yet. > 5. Client issues scan.next and during StoreScanner.resetScannerStack(), we > get a FNFE. RegionServer throws a UnknownScannerThe client retries in 1.3. > With branch-1.4, the scan fails with a DoNotRetryIOException. > [~ram_krish], My proposal is to increment the reader count during > updateReaders() and decrement it during resetScannerStack(), so discharger > doesn't clean it up. Scan lease expiries also have to be taken care of. Am I > missing anything? Is there a better approach? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
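The proposal in the description (pin flushed files in updateReaders(), unpin them in resetScannerStack(), and also unpin on lease expiry) can be illustrated with a small reference-counting sketch. This is only an illustration under stated assumptions, not the attached patch: `TrackedFile`, `ScannerSketch`, `tryDischarge()` and `close()` are hypothetical names, and only the two method names `updateReaders` and `resetScannerStack` are borrowed from the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a flushed store file; not an HBase class. */
final class TrackedFile {
  private final String name;
  private int refCount;          // scanners that still need this file
  private boolean compactedAway; // true once a compaction has replaced it
  private boolean deleted;

  TrackedFile(String name) { this.name = name; }

  String name() { return name; }
  synchronized void retain() { refCount++; }
  synchronized void release() {
    if (refCount == 0) throw new IllegalStateException("release without retain: " + name);
    refCount--;
  }
  synchronized void markCompactedAway() { compactedAway = true; }
  synchronized boolean isDeleted() { return deleted; }

  /** Discharger step: delete only if compacted away and unreferenced. */
  synchronized boolean tryDischarge() {
    if (compactedAway && refCount == 0) deleted = true;
    return deleted;
  }
}

/** Hypothetical scanner that pins flushed files until it has switched to them. */
final class ScannerSketch {
  private final List<TrackedFile> pending = new ArrayList<>();

  /** Called on flush: pin the new files so the discharger skips them. */
  synchronized void updateReaders(List<TrackedFile> flushedFiles) {
    for (TrackedFile f : flushedFiles) {
      f.retain();
      pending.add(f);
    }
  }

  /** Called on the next scan.next: switch to the pinned files, then unpin. */
  synchronized List<String> resetScannerStack() {
    List<String> opened = new ArrayList<>();
    for (TrackedFile f : pending) {
      if (f.isDeleted()) throw new IllegalStateException("file vanished: " + f.name());
      opened.add(f.name()); // a real scanner would open a reader here and hold it
      f.release();
    }
    pending.clear();
    return opened;
  }

  /** Called on close or lease expiry: unpin anything never switched to. */
  synchronized void close() {
    for (TrackedFile f : pending) f.release();
    pending.clear();
  }
}
```

In this sketch a file that has been flushed, handed to a scanner, and then compacted away survives a discharger pass until the scanner either switches to it or is closed. The `close()` path stands in for the lease-expiry handling the description calls out, since a scanner that never calls resetScannerStack() would otherwise pin its files forever.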
[jira] [Created] (HBASE-19468) FNFE during scans and flushes
Thiruvel Thirumoolan created HBASE-19468: Summary: FNFE during scans and flushes Key: HBASE-19468 URL: https://issues.apache.org/jira/browse/HBASE-19468 Project: HBase Issue Type: Sub-task Affects Versions: 1.3.1 Reporter: Thiruvel Thirumoolan Priority: Minor We see FileNotFoundExceptions (FNFE) on our 1.3 clusters when scans and flushes happen at the same time. This causes the regionserver to throw an UnknownScannerException, and the client retries. This happens during the following sequence: 1. Scanner is open; the client has fetched some rows from the regionserver and is working on them. 2. Flush happens and storeScanner is updated with flushed files (StoreScanner.updateReaders()). 3. Compaction discharger runs and cleans up the newly flushed file as we don't have new scanners on it yet. 4. Client issues scan.next and during StoreScanner.resetScannerStack(), we get a FNFE. RegionServer throws an UnknownScannerException. The client retries in 1.3. With branch-1.4, the scan fails with a DoNotRetryIOException. [~ram_krish], My proposal is to increment the reader count during updateReaders() and decrement it during resetScannerStack(), so the discharger doesn't clean it up. Scan lease expiries also have to be taken care of. Am I missing anything? Is there a better approach? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18349) Enable disabled tests in TestFavoredStochasticLoadBalancer that were disabled by Proc-V2 AM in HBASE-14614
[ https://issues.apache.org/jira/browse/HBASE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183365#comment-16183365 ] Thiruvel Thirumoolan commented on HBASE-18349: -- Was on vacation for a while, will get back to this one. > Enable disabled tests in TestFavoredStochasticLoadBalancer that were disabled > by Proc-V2 AM in HBASE-14614 > -- > > Key: HBASE-18349 > URL: https://issues.apache.org/jira/browse/HBASE-18349 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0-alpha-1 >Reporter: Stephen Yuan Jiang >Assignee: Thiruvel Thirumoolan > > The following 3 tests in TestFavoredStochasticLoadBalancer were disabled by > HBASE-14614 (Core Proc-V2 AM): > - testAllFavoredNodesDead > - testAllFavoredNodesDeadMasterRestarted > - testMisplacedRegions > This JIRA is tracking necessary work to re-enable (or remove/change if not > applicable) these UTs -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18356) Enable TestFavoredStochasticBalancerPickers#testPickers that was disabled by Proc-V2 AM in HBASE-14614
[ https://issues.apache.org/jira/browse/HBASE-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128190#comment-16128190 ] Thiruvel Thirumoolan commented on HBASE-18356: -- Unit test failures unrelated. The same patch applies to branch-2 as well, let me know if I should attach a separate patch for that. > Enable TestFavoredStochasticBalancerPickers#testPickers that was disabled by > Proc-V2 AM in HBASE-14614 > -- > > Key: HBASE-18356 > URL: https://issues.apache.org/jira/browse/HBASE-18356 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0-alpha-1 >Reporter: Stephen Yuan Jiang >Assignee: Thiruvel Thirumoolan > Attachments: HBASE-18356.master.001.patch > > > The testPickers in TestFavoredStochasticBalancerPickers hangs after applying > the change in Core Proc-V2 AM in HBASE-14614. It was disabled. > This JIRA tracks the work to enable it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18356) Enable TestFavoredStochasticBalancerPickers#testPickers that was disabled by Proc-V2 AM in HBASE-14614
[ https://issues.apache.org/jira/browse/HBASE-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-18356: - Status: Patch Available (was: Open) > Enable TestFavoredStochasticBalancerPickers#testPickers that was disabled by > Proc-V2 AM in HBASE-14614 > -- > > Key: HBASE-18356 > URL: https://issues.apache.org/jira/browse/HBASE-18356 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0-alpha-1 >Reporter: Stephen Yuan Jiang >Assignee: Thiruvel Thirumoolan > Attachments: HBASE-18356.master.001.patch > > > The testPickers in TestFavoredStochasticBalancerPickers hangs after applying > the change in Core Proc-V2 AM in HBASE-14614. It was disabled. > This JIRA tracks the work to enable it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18356) Enable TestFavoredStochasticBalancerPickers#testPickers that was disabled by Proc-V2 AM in HBASE-14614
[ https://issues.apache.org/jira/browse/HBASE-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-18356: - Attachment: HBASE-18356.master.001.patch > Enable TestFavoredStochasticBalancerPickers#testPickers that was disabled by > Proc-V2 AM in HBASE-14614 > -- > > Key: HBASE-18356 > URL: https://issues.apache.org/jira/browse/HBASE-18356 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0-alpha-1 >Reporter: Stephen Yuan Jiang >Assignee: Thiruvel Thirumoolan > Attachments: HBASE-18356.master.001.patch > > > The testPickers in TestFavoredStochasticBalancerPickers hangs after applying > the change in Core Proc-V2 AM in HBASE-14614. It was disabled. > This JIRA tracks the work to enable it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HBASE-18431) Mitigate compatibility concerns between branch-1.3 and branch-1.4
[ https://issues.apache.org/jira/browse/HBASE-18431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HBASE-18431: Assignee: Andrew Purtell (was: Thiruvel Thirumoolan) > Mitigate compatibility concerns between branch-1.3 and branch-1.4 > - > > Key: HBASE-18431 > URL: https://issues.apache.org/jira/browse/HBASE-18431 > Project: HBase > Issue Type: Bug >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Blocker > Fix For: 1.4.0, 1.5.0 > > Attachments: HBASE-18431-branch-1.4.patch, > HBASE-18431-branch-1.patch, HBASE-18431-branch-2-WIP.patch > > > There are compatibility concerns with branch-1.4. > {noformat} > Library Name HBase > Version #11.3.1 > Version #21.4.0-SNAPSHOT > Subject Binary Compatibility > Compatibility - 89.9% > Added Methods - 305 > Removed Methods - 105 > Problems with Data Types > High - 23 > Medium - 9 > Low - 21 > {noformat} > {noformat} > Library Name HBase > Version #11.3.1 > Version #21.4.0-SNAPSHOT > Subject Source Compatibility > Compatibility- 86.5% > Added Methods - 305 > Removed Methods - 105 > Problems with Data Types > High - 88 > Medium - 0 > Low - 0 > Other Changes in Data Types- 25 > {noformat} > This report includes HBASE-15816 which hasn't been committed yet. Otherwise > it's current. > I'm not generally concerned with added methods. > The following methods have been added to Public/Evolving interface Table. > Pointing them out in case it merits review. > \\ > * Abstract method Table.getReadRpcTimeout ( ) has been added to this > interface. No effect. > * Abstract method Table.getWriteRpcTimeout ( ) has been added to this > interface. No effect. > * Abstract method Table.setReadRpcTimeout ( int ) has been added to this > interface. No effect. > * Abstract method Table.setWriteRpcTimeout ( int ) has been added to this > interface. > The Public/Evolving interface Admin has some signature changes equating to > removed methods. 
I don't think this is allowed in a minor release. > \\ > * Abstract method Admin.isSnapshotFinished ( HBaseProtos.SnapshotDescription > ) has been removed from Admin. > * Abstract method Admin.snapshot ( String, TableName, > HBaseProtos.SnapshotDescription.Type ) has been removed from Admin. > * Abstract method Admin.snapshot ( HBaseProtos.SnapshotDescription ) has been > removed from Admin. > * Abstract method Admin.takeSnapshotAsync ( HBaseProtos.SnapshotDescription > ) has been removed from Admin. > The LimitedPrivate(CONFIG) interface AsyncRpcClient has been removed. This > change is debatable but I think we can allow it. > \\ > * AsyncRpcClient has been removed > The Public/Evolving class FastLongHistogram has been removed. I don't believe > this change is allowed in a minor release. > \\ > * FastLongHistogram has been removed > Method signatures in LimitedPrivate(COPROC) interfaces MasterObserver and > RegionObserver have changed, equating to removed methods. The first set of > changes is due to move of SnapshotDescription from HBaseProtos to > SnapshotProtos: > \\ > * Abstract method MasterObserver.postCloneSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver. > * Abstract method MasterObserver.postDeleteSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription ) has been removed from MasterObserver. > * Abstract method MasterObserver.postListSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription ) has been removed from MasterObserver. > * Abstract method MasterObserver.postRestoreSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver. > * Abstract method MasterObserver.postSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver. 
> * Abstract method MasterObserver.preCloneSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver. > * Abstract method MasterObserver.preDeleteSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription ) has been removed from MasterObserver. > * Abstract method MasterObserver.preListSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription ) has been removed from MasterObserver. > * Abstract method MasterObserver.preRestoreSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver. > * Abstract method MasterObserver.preSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver. > Here
[jira] [Assigned] (HBASE-18431) Mitigate compatibility concerns between branch-1.3 and branch-1.4
[ https://issues.apache.org/jira/browse/HBASE-18431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HBASE-18431: Assignee: Thiruvel Thirumoolan (was: Andrew Purtell) > Mitigate compatibility concerns between branch-1.3 and branch-1.4 > - > > Key: HBASE-18431 > URL: https://issues.apache.org/jira/browse/HBASE-18431 > Project: HBase > Issue Type: Bug >Reporter: Andrew Purtell >Assignee: Thiruvel Thirumoolan >Priority: Blocker > Fix For: 1.4.0, 1.5.0 > > Attachments: HBASE-18431-branch-1.4.patch, > HBASE-18431-branch-1.patch, HBASE-18431-branch-2-WIP.patch > > > There are compatibility concerns with branch-1.4. > {noformat} > Library Name HBase > Version #11.3.1 > Version #21.4.0-SNAPSHOT > Subject Binary Compatibility > Compatibility - 89.9% > Added Methods - 305 > Removed Methods - 105 > Problems with Data Types > High - 23 > Medium - 9 > Low - 21 > {noformat} > {noformat} > Library Name HBase > Version #11.3.1 > Version #21.4.0-SNAPSHOT > Subject Source Compatibility > Compatibility- 86.5% > Added Methods - 305 > Removed Methods - 105 > Problems with Data Types > High - 88 > Medium - 0 > Low - 0 > Other Changes in Data Types- 25 > {noformat} > This report includes HBASE-15816 which hasn't been committed yet. Otherwise > it's current. > I'm not generally concerned with added methods. > The following methods have been added to Public/Evolving interface Table. > Pointing them out in case it merits review. > \\ > * Abstract method Table.getReadRpcTimeout ( ) has been added to this > interface. No effect. > * Abstract method Table.getWriteRpcTimeout ( ) has been added to this > interface. No effect. > * Abstract method Table.setReadRpcTimeout ( int ) has been added to this > interface. No effect. > * Abstract method Table.setWriteRpcTimeout ( int ) has been added to this > interface. > The Public/Evolving interface Admin has some signature changes equating to > removed methods. 
I don't think this is allowed in a minor release. > \\ > * Abstract method Admin.isSnapshotFinished ( HBaseProtos.SnapshotDescription > ) has been removed from Admin. > * Abstract method Admin.snapshot ( String, TableName, > HBaseProtos.SnapshotDescription.Type ) has been removed from Admin. > * Abstract method Admin.snapshot ( HBaseProtos.SnapshotDescription ) has been > removed from Admin. > * Abstract method Admin.takeSnapshotAsync ( HBaseProtos.SnapshotDescription > ) has been removed from Admin. > The LimitedPrivate(CONFIG) interface AsyncRpcClient has been removed. This > change is debatable but I think we can allow it. > \\ > * AsyncRpcClient has been removed > The Public/Evolving class FastLongHistogram has been removed. I don't believe > this change is allowed in a minor release. > \\ > * FastLongHistogram has been removed > Method signatures in LimitedPrivate(COPROC) interfaces MasterObserver and > RegionObserver have changed, equating to removed methods. The first set of > changes is due to move of SnapshotDescription from HBaseProtos to > SnapshotProtos: > \\ > * Abstract method MasterObserver.postCloneSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver. > * Abstract method MasterObserver.postDeleteSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription ) has been removed from MasterObserver. > * Abstract method MasterObserver.postListSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription ) has been removed from MasterObserver. > * Abstract method MasterObserver.postRestoreSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver. > * Abstract method MasterObserver.postSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver. 
> * Abstract method MasterObserver.preCloneSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver. > * Abstract method MasterObserver.preDeleteSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription ) has been removed from MasterObserver. > * Abstract method MasterObserver.preListSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription ) has been removed from MasterObserver. > * Abstract method MasterObserver.preRestoreSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver. > * Abstract method MasterObserver.preSnapshot ( > ObserverContext, > HBaseProtos.SnapshotDescription, HTableDescriptor ) has been removed from > MasterObserver.
[jira] [Assigned] (HBASE-18349) Enable disabled tests in TestFavoredStochasticLoadBalancer that were disabled by Proc-V2 AM in HBASE-14614
[ https://issues.apache.org/jira/browse/HBASE-18349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HBASE-18349: Assignee: Thiruvel Thirumoolan > Enable disabled tests in TestFavoredStochasticLoadBalancer that were disabled > by Proc-V2 AM in HBASE-14614 > -- > > Key: HBASE-18349 > URL: https://issues.apache.org/jira/browse/HBASE-18349 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0-alpha-1 >Reporter: Stephen Yuan Jiang >Assignee: Thiruvel Thirumoolan > > The following 3 tests in TestFavoredStochasticLoadBalancer were disabled by > HBASE-14614 (Core Proc-V2 AM): > - testAllFavoredNodesDead > - testAllFavoredNodesDeadMasterRestarted > - testMisplacedRegions > This JIRA is tracking necessary work to re-enable (or remove/change if not > applicable) these UTs -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18356) Enable TestFavoredStochasticBalancerPickers#testPickers that was disabled by Proc-V2 AM in HBASE-14614
[ https://issues.apache.org/jira/browse/HBASE-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16080786#comment-16080786 ] Thiruvel Thirumoolan commented on HBASE-18356: -- Lemme give this a shot this week. > Enable TestFavoredStochasticBalancerPickers#testPickers that was disabled by > Proc-V2 AM in HBASE-14614 > -- > > Key: HBASE-18356 > URL: https://issues.apache.org/jira/browse/HBASE-18356 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0-alpha-1 >Reporter: Stephen Yuan Jiang >Assignee: Thiruvel Thirumoolan > > The testPickers in TestFavoredStochasticBalancerPickers hangs after applying > the change in Core Proc-V2 AM in HBASE-14614. It was disabled. > This JIRA tracks the work to enable it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HBASE-18356) Enable TestFavoredStochasticBalancerPickers#testPickers that was disabled by Proc-V2 AM in HBASE-14614
[ https://issues.apache.org/jira/browse/HBASE-18356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HBASE-18356: Assignee: Thiruvel Thirumoolan > Enable TestFavoredStochasticBalancerPickers#testPickers that was disabled by > Proc-V2 AM in HBASE-14614 > -- > > Key: HBASE-18356 > URL: https://issues.apache.org/jira/browse/HBASE-18356 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 2.0.0-alpha-1 >Reporter: Stephen Yuan Jiang >Assignee: Thiruvel Thirumoolan > > The testPickers in TestFavoredStochasticBalancerPickers hangs after applying > the change in Core Proc-V2 AM in HBASE-14614. It was disabled. > This JIRA tracks the work to enable it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18062) Admin API to removeFavoredNode on node decommission
[ https://issues.apache.org/jira/browse/HBASE-18062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16069072#comment-16069072 ] Thiruvel Thirumoolan commented on HBASE-18062: -- All favored admin APIs have a group and non-group version. Was planning to do both of them together. The group version of APIs would need FavoredRSGroupBalancer patch (HBASE-15533). That couldn't be pushed because we couldn't get the precommit builds to run (as rsgroups tests don't work, HBASE-18110). If enabling rsgroups tests would take a little bit of time, I can start with the non-group version of the APIs. > Admin API to removeFavoredNode on node decommission > --- > > Key: HBASE-18062 > URL: https://issues.apache.org/jira/browse/HBASE-18062 > Project: HBase > Issue Type: Sub-task >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > > From the design doc: > This command is useful while decommissioning a server. It removes the > specified server (hostname:port) as favored node from regions. This operation > does not change any assignments, but alters favored node information in > hbase:meta. The users have to make sure that the specified server is not > hosting any regions. Since region servers go down much more frequently than > datanodes, the admin has to use this tool when decommissioning a datanode and > region server. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18244) org.apache.hadoop.hbase.client.rsgroup.TestShellRSGroups hangs/fails
[ https://issues.apache.org/jira/browse/HBASE-18244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16056702#comment-16056702 ] Thiruvel Thirumoolan commented on HBASE-18244: -- [~elserj], The RSGroups tests were disabled as part of AMv2 changes, tracked at HBASE-18110. My observation is the same as yours. cc [~syuanjiang] > org.apache.hadoop.hbase.client.rsgroup.TestShellRSGroups hangs/fails > > > Key: HBASE-18244 > URL: https://issues.apache.org/jira/browse/HBASE-18244 > Project: HBase > Issue Type: Bug > Components: test >Reporter: Josh Elser > Fix For: 3.0.0 > > > Sometime in the past couple of weeks, TestShellRSGroups has started > timing-out/failing for me. > It will get stuck on a call to moveTables() > {noformat} > "main" #1 prio=5 os_prio=31 tid=0x7ff012004800 nid=0x1703 in > Object.wait() [0x7020d000] >java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > at java.lang.Object.wait(Object.java:502) > at > org.apache.hadoop.hbase.ipc.BlockingRpcCallback.get(BlockingRpcCallback.java:62) > - locked <0x00078d1003f0> (a > org.apache.hadoop.hbase.ipc.BlockingRpcCallback) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:328) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$200(AbstractRpcClient.java:94) > at > org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:567) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$BlockingStub.execMasterService(MasterProtos.java) > at > org.apache.hadoop.hbase.client.ConnectionImplementation$3.execMasterService(ConnectionImplementation.java:1500) > at > org.apache.hadoop.hbase.client.HBaseAdmin$67$1.rpcCall(HBaseAdmin.java:2991) > at > org.apache.hadoop.hbase.client.HBaseAdmin$67$1.rpcCall(HBaseAdmin.java:2986) > at > org.apache.hadoop.hbase.client.MasterCallable.call(MasterCallable.java:98) > 
at > org.apache.hadoop.hbase.client.HBaseAdmin$67.callExecService(HBaseAdmin.java:2997) > at > org.apache.hadoop.hbase.client.SyncCoprocessorRpcChannel.callBlockingMethod(SyncCoprocessorRpcChannel.java:69) > at > org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService$BlockingStub.moveTables(RSGroupAdminProtos.java:13171) > at > org.apache.hadoop.hbase.rsgroup.RSGroupAdminClient.moveTables(RSGroupAdminClient.java:117) > {noformat} > The server-side end of the RPC is waiting on a procedure to finish: > {noformat} > "RpcServer.default.FPBQ.Fifo.handler=27,queue=0,port=64242" #289 daemon > prio=5 os_prio=31 tid=0x7ff015b7c000 nid=0x1e603 waiting on condition > [0x7dbc9000] >java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:184) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitFor(ProcedureSyncWait.java:171) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToComplete(ProcedureSyncWait.java:141) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.waitForProcedureToCompleteIOE(ProcedureSyncWait.java:130) > at > org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait.submitAndWaitProcedure(ProcedureSyncWait.java:123) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.unassign(AssignmentManager.java:478) > at > org.apache.hadoop.hbase.master.assignment.AssignmentManager.unassign(AssignmentManager.java:465) > at > org.apache.hadoop.hbase.rsgroup.RSGroupAdminServer.moveTables(RSGroupAdminServer.java:432) > at > org.apache.hadoop.hbase.rsgroup.RSGroupAdminEndpoint$RSGroupAdminServiceImpl.moveTables(RSGroupAdminEndpoint.java:174) > at > org.apache.hadoop.hbase.protobuf.generated.RSGroupAdminProtos$RSGroupAdminService.callMethod(RSGroupAdminProtos.java:12786) > at > 
org.apache.hadoop.hbase.master.MasterRpcServices.execMasterService(MasterRpcServices.java:673) > at > org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:278) > at > org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:258) >
[jira] [Commented] (HBASE-15533) Add RSGroup Favored Balancer
[ https://issues.apache.org/jira/browse/HBASE-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16022070#comment-16022070 ] Thiruvel Thirumoolan commented on HBASE-15533: -- Unit test failures are unrelated. > Add RSGroup Favored Balancer > > > Key: HBASE-15533 > URL: https://issues.apache.org/jira/browse/HBASE-15533 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Francis Liu >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-15533.master.001.patch, > HBASE-15533.master.002.patch, HBASE-15533.patch, HBASE-15533.rough.draft.patch > > > HBASE-16942 added favored stochastic load balancer so we can pick and choose > nodes to assign based on the favored nodes and load/locality. The intention > of this jira is to add a group based load balancer on top of the favored > stochastic balancer. This will ensure splits/merges will only use favored > nodes from that group and will inherit from the parents appropriately. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-15533) Add RSGroup Favored Balancer
[ https://issues.apache.org/jira/browse/HBASE-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-15533: - Attachment: HBASE-15533.master.002.patch > Add RSGroup Favored Balancer > > > Key: HBASE-15533 > URL: https://issues.apache.org/jira/browse/HBASE-15533 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Francis Liu >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-15533.master.001.patch, > HBASE-15533.master.002.patch, HBASE-15533.patch, HBASE-15533.rough.draft.patch > > > HBASE-16942 added favored stochastic load balancer so we can pick and choose > nodes to assign based on the favored nodes and load/locality. The intention > of this jira is to add a group based load balancer on top of the favored > stochastic balancer. This will ensure splits/merges will only use favored > nodes from that group and will inherit from the parents appropriately. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18065) Admin API to do complete redistribute of favored nodes
Thiruvel Thirumoolan created HBASE-18065: Summary: Admin API to do complete redistribute of favored nodes Key: HBASE-18065 URL: https://issues.apache.org/jira/browse/HBASE-18065 Project: HBase Issue Type: Sub-task Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan The complete redistribute command/API creates new favored nodes for existing tables. This command is very invasive and disturbs both region assignment and data locality. All the regions get a new set of favored nodes and are moved to the primary favored node. The favored nodes are generated in round-robin fashion to ensure they are distributed evenly throughout the cluster. This command should only be used as a last resort when favored nodes must be regenerated and loss of locality is acceptable. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
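The round-robin generation described above can be sketched roughly as follows. This is only an illustration, not the HBase implementation: the class name, the method name, the plain-string server names and the choice of three favored nodes per region are all assumptions made for the example, and rack awareness is left out entirely.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: assign favored nodes to regions in round-robin order.
// Nothing here is an HBase API; regions and servers are plain strings.
final class RoundRobinFavoredNodesSketch {
  // Assumption for this example: three favored nodes per region.
  static final int FAVORED_NODES_PER_REGION = 3;

  // Returns region -> favored nodes; the first entry plays the role of the primary.
  static Map<String, List<String>> generate(List<String> regions, List<String> servers) {
    if (servers.size() < FAVORED_NODES_PER_REGION) {
      throw new IllegalArgumentException("need at least " + FAVORED_NODES_PER_REGION + " servers");
    }
    Map<String, List<String>> result = new LinkedHashMap<>();
    int next = 0; // index of the server that becomes primary for the next region
    for (String region : regions) {
      List<String> favored = new ArrayList<>(FAVORED_NODES_PER_REGION);
      for (int i = 0; i < FAVORED_NODES_PER_REGION; i++) {
        // Take consecutive servers, wrapping around the end of the list.
        favored.add(servers.get((next + i) % servers.size()));
      }
      result.put(region, favored);
      next = (next + 1) % servers.size(); // advance so primaries rotate evenly
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> servers = List.of("s1", "s2", "s3", "s4");
    List<String> regions = List.of("r1", "r2", "r3", "r4", "r5");
    generate(regions, servers).forEach((r, fn) -> System.out.println(r + " -> " + fn));
  }
}
```

Rotating the starting index by one per region is what spreads the primary role evenly; every region ends up with a different set than before, which is why the command costs locality.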
[jira] [Created] (HBASE-18064) Admin API to redistribute Favored Nodes
Thiruvel Thirumoolan created HBASE-18064: Summary: Admin API to redistribute Favored Nodes Key: HBASE-18064 URL: https://issues.apache.org/jira/browse/HBASE-18064 Project: HBase Issue Type: Sub-task Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan The redistribute command/API will spread favored nodes of regions across all online region servers. This should be useful when there are new machines added to the cluster. This makes sure that existing region assignment and data locality are not disturbed. The client is thin and the redistribution logic is done by the balancer, run as part of the Master. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18063) Admin API to check for favored nodes
Thiruvel Thirumoolan created HBASE-18063: Summary: Admin API to check for favored nodes Key: HBASE-18063 URL: https://issues.apache.org/jira/browse/HBASE-18063 Project: HBase Issue Type: Sub-task Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan From the design doc: This scans all the regions to see if all the favored nodes used by the regions are online and returns a list of dead servers being referenced. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
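A minimal sketch of the check described above, assuming the favored nodes are already available as an in-memory map. The class and method names are hypothetical and servers are plain hostname:port strings; the actual API would read this information from the cluster rather than take it as arguments.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: find favored nodes that are no longer online.
// Nothing here is an HBase API.
final class FavoredNodesCheckSketch {
  // Scans every region's favored nodes and collects the ones missing from onlineServers.
  static Set<String> findDeadFavoredNodes(
      Map<String, List<String>> favoredNodesByRegion, Set<String> onlineServers) {
    Set<String> dead = new TreeSet<>(); // sorted and de-duplicated for stable output
    for (List<String> favoredNodes : favoredNodesByRegion.values()) {
      for (String server : favoredNodes) {
        if (!onlineServers.contains(server)) {
          dead.add(server);
        }
      }
    }
    return dead;
  }

  public static void main(String[] args) {
    Map<String, List<String>> fn = Map.of(
        "r1", List.of("a:16020", "b:16020", "c:16020"),
        "r2", List.of("b:16020", "c:16020", "d:16020"));
    System.out.println(findDeadFavoredNodes(fn, Set.of("a:16020", "b:16020")));
  }
}
```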
[jira] [Created] (HBASE-18062) Admin API to removeFavoredNode on node decommission
Thiruvel Thirumoolan created HBASE-18062: Summary: Admin API to removeFavoredNode on node decommission Key: HBASE-18062 URL: https://issues.apache.org/jira/browse/HBASE-18062 Project: HBase Issue Type: Sub-task Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan From the design doc: This command is useful while decommissioning a server. It removes the specified server (hostname:port) as favored node from regions. This operation does not change any assignments, but alters favored node information in hbase:meta. The users have to make sure that the specified server is not hosting any regions. Since region servers go down much more frequently than datanodes, the admin has to use this tool when decommissioning a datanode and region server. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
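The effect of the command described above can be illustrated with a small sketch that works on an in-memory copy of the favored node information. The class and method names are hypothetical, servers are plain hostname:port strings, and the real operation would update hbase:meta instead of returning a map. Region assignments are not modelled at all, matching the statement that they are left unchanged.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: drop one server from every region's favored nodes.
// Nothing here is an HBase API.
final class RemoveFavoredNodeSketch {
  // Returns a new map without the given server; the input map is not modified.
  // The caller is assumed to have verified that the server hosts no regions.
  static Map<String, List<String>> removeFavoredNode(
      Map<String, List<String>> favoredNodesByRegion, String server) {
    Map<String, List<String>> updated = new LinkedHashMap<>();
    for (Map.Entry<String, List<String>> entry : favoredNodesByRegion.entrySet()) {
      List<String> remaining = new ArrayList<>(entry.getValue());
      remaining.removeIf(server::equals); // regions not referencing the server are unchanged
      updated.put(entry.getKey(), remaining);
    }
    return updated;
  }

  public static void main(String[] args) {
    Map<String, List<String>> fn = new LinkedHashMap<>();
    fn.put("r1", List.of("a:16020", "b:16020", "c:16020"));
    fn.put("r2", List.of("b:16020", "c:16020", "d:16020"));
    System.out.println(removeFavoredNode(fn, "c:16020"));
  }
}
```

Regions that referenced the removed server are left with fewer favored nodes in this sketch; how replacements are chosen is outside what the description states.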
[jira] [Created] (HBASE-17971) Follow up enhancements for Favored nodes pickers
Thiruvel Thirumoolan created HBASE-17971: Summary: Follow up enhancements for Favored nodes pickers Key: HBASE-17971 URL: https://issues.apache.org/jira/browse/HBASE-17971 Project: HBase Issue Type: Sub-task Components: FavoredNodes Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan [~toffer] suggested enhancements to the favored stochastic balancer on the review board for HBASE-16942. This is a follow up to address that. +[~devaraj]. https://reviews.apache.org/r/54724/#comment245802 {noformat} ideally this should be picking the server with the highest locality for the region. tho this will suffice for now the goal is just a bit different. prolly a future enhancement {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-15533) Add RSGroup Favored Balancer
[ https://issues.apache.org/jira/browse/HBASE-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-15533: - Description: HBASE-16942 added favored stochastic load balancer so we can pick and choose nodes to assign based on the favored nodes and load/locality. The intention of this jira is to add a group based load balancer on top of the favored stochastic balancer. This will ensure splits/merges will only use favored nodes from that group and will inherit from the parents appropriately. > Add RSGroup Favored Balancer > > > Key: HBASE-15533 > URL: https://issues.apache.org/jira/browse/HBASE-15533 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Francis Liu >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-15533.master.001.patch, HBASE-15533.patch, > HBASE-15533.rough.draft.patch > > > HBASE-16942 added favored stochastic load balancer so we can pick and choose > nodes to assign based on the favored nodes and load/locality. The intention > of this jira is to add a group based load balancer on top of the favored > stochastic balancer. This will ensure splits/merges will only use favored > nodes from that group and will inherit from the parents appropriately. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-15533) Add RSGroup Favored Balancer
[ https://issues.apache.org/jira/browse/HBASE-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987365#comment-15987365 ] Thiruvel Thirumoolan commented on HBASE-15533: -- [~tedyu] - It's already posted on RB. I would like to take another look and see if I can clean up the tests/add more. > Add RSGroup Favored Balancer > > > Key: HBASE-15533 > URL: https://issues.apache.org/jira/browse/HBASE-15533 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Francis Liu >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-15533.master.001.patch, HBASE-15533.patch, > HBASE-15533.rough.draft.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-15533) Add RSGroup Favored Balancer
[ https://issues.apache.org/jira/browse/HBASE-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-15533: - Status: Patch Available (was: Open) Submitting patch for precommit checks to run. > Add RSGroup Favored Balancer > > > Key: HBASE-15533 > URL: https://issues.apache.org/jira/browse/HBASE-15533 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Francis Liu >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-15533.master.001.patch, HBASE-15533.patch, > HBASE-15533.rough.draft.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-15533) Add RSGroup Favored Balancer
[ https://issues.apache.org/jira/browse/HBASE-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-15533: - Attachment: HBASE-15533.master.001.patch > Add RSGroup Favored Balancer > > > Key: HBASE-15533 > URL: https://issues.apache.org/jira/browse/HBASE-15533 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Francis Liu >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-15533.master.001.patch, HBASE-15533.patch, > HBASE-15533.rough.draft.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985984#comment-15985984 ] Thiruvel Thirumoolan commented on HBASE-16942: -- Thanks [~syuanjiang], [~toffer] and [~tedyu] for the reviews! > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE-16942.master.008.patch, HBASE-16942.master.009.patch, > HBASE-16942.master.010.patch, HBASE-16942.master.011.patch, > HBASE-16942.master.012.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985828#comment-15985828 ] Thiruvel Thirumoolan commented on HBASE-16942: -- Unit test failures unrelated. > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE-16942.master.008.patch, HBASE-16942.master.009.patch, > HBASE-16942.master.010.patch, HBASE-16942.master.011.patch, > HBASE-16942.master.012.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Attachment: HBASE-16942.master.012.patch > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE-16942.master.008.patch, HBASE-16942.master.009.patch, > HBASE-16942.master.010.patch, HBASE-16942.master.011.patch, > HBASE-16942.master.012.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Attachment: HBASE-16942.master.011.patch > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE-16942.master.008.patch, HBASE-16942.master.009.patch, > HBASE-16942.master.010.patch, HBASE-16942.master.011.patch, > HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15952026#comment-15952026 ] Thiruvel Thirumoolan commented on HBASE-16942: -- Unit test failures unrelated to patch. > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE-16942.master.008.patch, HBASE-16942.master.009.patch, > HBASE-16942.master.010.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951792#comment-15951792 ] Thiruvel Thirumoolan commented on HBASE-16942: -- Rebased and uploaded the patch; the unit tests ran fine on my laptop. https://reviews.apache.org/r/54724/diff/6-8/ shows the changes made since the patch was last approved. Patch pending review from [~toffer]. If there are any new issues with the precommit build, I will address them and upload. > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE-16942.master.008.patch, HBASE-16942.master.009.patch, > HBASE-16942.master.010.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Attachment: HBASE-16942.master.010.patch > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE-16942.master.008.patch, HBASE-16942.master.009.patch, > HBASE-16942.master.010.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Attachment: HBASE-16942.master.009.patch > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE-16942.master.008.patch, HBASE-16942.master.009.patch, > HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951596#comment-15951596 ] Thiruvel Thirumoolan commented on HBASE-16942: -- Will rebase and upload. > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE-16942.master.008.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-15533) Add RSGroup Favored Balancer
[ https://issues.apache.org/jira/browse/HBASE-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-15533: - Attachment: HBASE-15533.patch Including FavoredRSGroup based balancer. This is built on top of FavoredStochasticBalancer from HBASE-16942. Once HBASE-16942 gets in, will submit this patch for precommit builds. > Add RSGroup Favored Balancer > > > Key: HBASE-15533 > URL: https://issues.apache.org/jira/browse/HBASE-15533 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Francis Liu >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-15533.patch, HBASE-15533.rough.draft.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-15533) Add RSGroup Favored Balancer
[ https://issues.apache.org/jira/browse/HBASE-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-15533: - Summary: Add RSGroup Favored Balancer (was: RSGroup related favored nodes enhancements) > Add RSGroup Favored Balancer > > > Key: HBASE-15533 > URL: https://issues.apache.org/jira/browse/HBASE-15533 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Francis Liu >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-15533.rough.draft.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15902311#comment-15902311 ] Thiruvel Thirumoolan commented on HBASE-16942: -- Unit test failure org.apache.hadoop.hbase.TestAcidGuarantees unrelated to changes. > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE-16942.master.008.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Attachment: HBASE-16942.master.008.patch > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE-16942.master.008.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15886311#comment-15886311 ] Thiruvel Thirumoolan commented on HBASE-16942: -- Test failures are unrelated. > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Attachment: HBASE-16942.master.007.patch > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE-16942.master.007.patch, > HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Attachment: HBASE-16942.master.006.patch > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE-16942.master.006.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Attachment: HBASE-16942.master.005.patch > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE-16942.master.005.patch, > HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17684) Tools/API to read favored nodes for region(s)
Thiruvel Thirumoolan created HBASE-17684: Summary: Tools/API to read favored nodes for region(s) Key: HBASE-17684 URL: https://issues.apache.org/jira/browse/HBASE-17684 Project: HBase Issue Type: Sub-task Components: FavoredNodes Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan We need APIs to read FN from Master. This will help in troubleshooting when regions are in RIT due to all FN being dead etc. For small clusters, we could just read from SnapshotOfRegionAssignmentFromMeta, but for large clusters it takes 4-5 mins. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17685) Tools/Admin API to dump the replica load of server(s)
Thiruvel Thirumoolan created HBASE-17685: Summary: Tools/Admin API to dump the replica load of server(s) Key: HBASE-17685 URL: https://issues.apache.org/jira/browse/HBASE-17685 Project: HBase Issue Type: Sub-task Components: FavoredNodes Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan RPM has an option to dump the favored node distribution. We need an API to get the replica load from master. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17683) Admin API to update favored nodes in Master
Thiruvel Thirumoolan created HBASE-17683: Summary: Admin API to update favored nodes in Master Key: HBASE-17683 URL: https://issues.apache.org/jira/browse/HBASE-17683 Project: HBase Issue Type: Sub-task Components: FavoredNodes Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan For troubleshooting/decommissioning nodes/replacing nodes, we need an API to update the FN for a set of regions in Master. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) Add FavoredStochasticLoadBalancer and FN Candidate generators
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Summary: Add FavoredStochasticLoadBalancer and FN Candidate generators (was: FavoredNodes - Balancer improvements) > Add FavoredStochasticLoadBalancer and FN Candidate generators > - > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) FavoredNodes - Balancer improvements
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Attachment: HBASE-16942.master.004.patch > FavoredNodes - Balancer improvements > > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE-16942.master.004.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17626) Refactor HRegionServer to use regionNames as keys for FN
[ https://issues.apache.org/jira/browse/HBASE-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861951#comment-15861951 ] Thiruvel Thirumoolan commented on HBASE-17626: -- Unit test failure is unrelated to the patch. > Refactor HRegionServer to use regionNames as keys for FN > > > Key: HBASE-17626 > URL: https://issues.apache.org/jira/browse/HBASE-17626 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-17626.master.001.patch > > > Similar to HBASE-16956, I would like to change HRegionServer and its FN cache > to use the full region name instead of the encoded name. In the worst case this avoids > clashes, and it is better to refactor this at the beginning rather than doing > it much later. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16943) FN Interfaces for writing tools for monitoring/operations
[ https://issues.apache.org/jira/browse/HBASE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16943: - Status: Patch Available (was: Open) > FN Interfaces for writing tools for monitoring/operations > - > > Key: HBASE-16943 > URL: https://issues.apache.org/jira/browse/HBASE-16943 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16943.master.001.patch > > > HBASE-15532 introduces new interfaces and tools. This sub-task is only for > the new interfaces and their implementation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16943) FN Interfaces for writing tools for monitoring/operations
[ https://issues.apache.org/jira/browse/HBASE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16943: - Attachment: HBASE-16943.master.001.patch > FN Interfaces for writing tools for monitoring/operations > - > > Key: HBASE-16943 > URL: https://issues.apache.org/jira/browse/HBASE-16943 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16943.master.001.patch > > > HBASE-15532 introduces new interfaces and tools. This sub-task is only for > the new interfaces and their implementation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17626) Refactor HRegionServer to use regionNames as keys for FN
[ https://issues.apache.org/jira/browse/HBASE-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17626: - Status: Patch Available (was: Open) > Refactor HRegionServer to use regionNames as keys for FN > > > Key: HBASE-17626 > URL: https://issues.apache.org/jira/browse/HBASE-17626 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-17626.master.001.patch > > > Similar to HBASE-16956, I would like to change HRegionServer and its FN cache > to use the full region name instead of the encoded name. In the worst case this avoids > clashes, and it is better to refactor this at the beginning rather than doing > it much later. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17626) Refactor HRegionServer to use regionNames as keys for FN
[ https://issues.apache.org/jira/browse/HBASE-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17626: - Attachment: HBASE-17626.master.001.patch > Refactor HRegionServer to use regionNames as keys for FN > > > Key: HBASE-17626 > URL: https://issues.apache.org/jira/browse/HBASE-17626 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-17626.master.001.patch > > > Similar to HBASE-16956, I would like to change HRegionServer and its FN cache > to use the full region name instead of the encoded name. In the worst case this avoids > clashes, and it is better to refactor this at the beginning rather than doing > it much later. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17626) Refactor HRegionServer to use regionNames as keys for FN
[ https://issues.apache.org/jira/browse/HBASE-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17626: - Component/s: FavoredNodes > Refactor HRegionServer to use regionNames as keys for FN > > > Key: HBASE-17626 > URL: https://issues.apache.org/jira/browse/HBASE-17626 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > > Similar to HBASE-16956, I would like to change HRegionServer and its FN cache > to use the full region name instead of the encoded name. In the worst case this avoids > clashes, and it is better to refactor this at the beginning rather than doing > it much later. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17626) Refactor HRegionServer to use regionNames as keys for FN
Thiruvel Thirumoolan created HBASE-17626: Summary: Refactor HRegionServer to use regionNames as keys for FN Key: HBASE-17626 URL: https://issues.apache.org/jira/browse/HBASE-17626 Project: HBase Issue Type: Sub-task Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Similar to HBASE-16956, I would like to change HRegionServer and its FN cache to use the full region name instead of the encoded name. In the worst case this avoids clashes, and it is better to refactor this at the beginning rather than doing it much later. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17620) Move table to another group (add -migrateTertiary)
[ https://issues.apache.org/jira/browse/HBASE-17620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15860180#comment-15860180 ] Thiruvel Thirumoolan commented on HBASE-17620: -- [~toffer]/[~devaraj] - What do you think? > Move table to another group (add -migrateTertiary) > -- > > Key: HBASE-17620 > URL: https://issues.apache.org/jira/browse/HBASE-17620 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > > As part of the design document in HBASE-15531, we mentioned an approach > to move tables to a new group. First, only one favored node would be moved to > the new group using something like the "rpm -migrateTertiary" command (this will > be something else since RPM will be deprecated). Once enough locality has > built up on the tertiary nodes, the table can be moved to the group. > In my experience, tables rarely move across groups, and > there is a brief period when one of the FN will belong to another > group. When regions split, we also have to consider this > situation and generate one FN from the target (or tertiary's) group. > Is this feature required? Do we need this as a start? I can attach tentative > patches and we could reconsider this in future if we don't need this as a > start. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17620) Move table to another group (add -migrateTertiary)
Thiruvel Thirumoolan created HBASE-17620: Summary: Move table to another group (add -migrateTertiary) Key: HBASE-17620 URL: https://issues.apache.org/jira/browse/HBASE-17620 Project: HBase Issue Type: Sub-task Components: FavoredNodes Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan As part of the design document in HBASE-15531, we mentioned an approach to move tables to a new group. First, only one favored node would be moved to the new group using something like the "rpm -migrateTertiary" command (this will be something else since RPM will be deprecated). Once enough locality has built up on the tertiary nodes, the table can be moved to the group. In my experience, tables rarely move across groups, and there is a brief period when one of the FN will belong to another group. When regions split, we also have to consider this situation and generate one FN from the target (or tertiary's) group. Is this feature required? Do we need this as a start? I can attach tentative patches and we could reconsider this in future if we don't need this as a start. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) FavoredNodes - Balancer improvements
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Attachment: HBASE-16942.master.003.patch > FavoredNodes - Balancer improvements > > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE-16942.master.003.patch, > HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17244) Refactor StartcodeAgnosticServerName so it doesn't extend ServerName
[ https://issues.apache.org/jira/browse/HBASE-17244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17244: - Component/s: FavoredNodes > Refactor StartcodeAgnosticServerName so it doesn't extend ServerName > > > Key: HBASE-17244 > URL: https://issues.apache.org/jira/browse/HBASE-17244 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0 > > > Follow up jira to address [~toffer]'s review comments @ > https://reviews.apache.org/r/53242/. /cc [~devaraj]. > {quote} > Apart from sharing a lot of functionality as ServerName it seems there is no > need to be a subclass. In fact it is prolly not good design to do so as it > may cause unwanted mixups for the user. (ie both instances in the same > collection). Since this is functionally correct. We can address this in a > separate jira. Prolly something we should do before we pull in the next jira. > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17198) FN updates during region merge (follow up to Procedure v2 merge)
[ https://issues.apache.org/jira/browse/HBASE-17198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17198: - Component/s: FavoredNodes > FN updates during region merge (follow up to Procedure v2 merge) > > > Key: HBASE-17198 > URL: https://issues.apache.org/jira/browse/HBASE-17198 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-17198.master.001.patch > > > As mentioned in https://reviews.apache.org/r/53242/ (HBASE-16941), since the > procedure v2 merge changes are in development, there is a follow up > optimization/cleanup that can be done for favored nodes during merge. This > jira will be taken up once HBASE-16119 is complete. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17153) Add FavoredNode checks to hbck -checkFavoredNodes
[ https://issues.apache.org/jira/browse/HBASE-17153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17153: - Component/s: FavoredNodes > Add FavoredNode checks to hbck -checkFavoredNodes > - > > Key: HBASE-17153 > URL: https://issues.apache.org/jira/browse/HBASE-17153 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > > Sample checks and error conditions include: > 1. Are favored nodes present for all regions? > 2. Are there any duplicates in favored nodes? > 3. Do we have the right number of favored nodes, 3 at the moment? > 4. Are all of them online or do we have dead favored nodes? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
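The four sample checks listed in HBASE-17153 could be sketched roughly as below. This is only an illustration, not the hbck implementation: the class name `FavoredNodeChecks`, the method `check`, the problem labels, and the use of plain "host:port" strings are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of HBase: illustrates the four sample checks.
public class FavoredNodeChecks {
  // The description says the right number is "3 at the moment".
  static final int EXPECTED_FAVORED_NODES = 3;

  /**
   * Returns the problems found for one region's favored nodes.
   * favoredNodes may be null when nothing is recorded for the region;
   * onlineServers holds the "host:port" strings of servers considered alive.
   */
  static List<String> check(List<String> favoredNodes, Set<String> onlineServers) {
    List<String> problems = new ArrayList<>();
    // Check 1: are favored nodes present at all?
    if (favoredNodes == null || favoredNodes.isEmpty()) {
      problems.add("MISSING");
      return problems;
    }
    // Check 2: any duplicates?
    if (new HashSet<>(favoredNodes).size() != favoredNodes.size()) {
      problems.add("DUPLICATES");
    }
    // Check 3: the right number of favored nodes?
    if (favoredNodes.size() != EXPECTED_FAVORED_NODES) {
      problems.add("WRONG_COUNT");
    }
    // Check 4: are all of them online?
    for (String fn : favoredNodes) {
      if (!onlineServers.contains(fn)) {
        problems.add("DEAD:" + fn);
      }
    }
    return problems;
  }
}
```

An empty result means the region passed all four checks; a real tool would presumably aggregate these per table and report them alongside the other hbck inconsistencies.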
[jira] [Updated] (HBASE-17386) Adding FN documentation to reference guide
[ https://issues.apache.org/jira/browse/HBASE-17386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17386: - Component/s: FavoredNodes > Adding FN documentation to reference guide > -- > > Key: HBASE-17386 > URL: https://issues.apache.org/jira/browse/HBASE-17386 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17299) Add integration tests for FavoredNodes feature
[ https://issues.apache.org/jira/browse/HBASE-17299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17299: - Component/s: FavoredNodes > Add integration tests for FavoredNodes feature > -- > > Key: HBASE-17299 > URL: https://issues.apache.org/jira/browse/HBASE-17299 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > > The tests will include write to tables and check if the store file's block > locations match the favored nodes for that region. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17107) FN info should be cleaned up on region/table cleanup
[ https://issues.apache.org/jira/browse/HBASE-17107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17107: - Component/s: FavoredNodes > FN info should be cleaned up on region/table cleanup > > > Key: HBASE-17107 > URL: https://issues.apache.org/jira/browse/HBASE-17107 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE_17107.draft.patch, HBASE-17107.master.001.patch > > > FN info should be cleaned up when table is deleted and when regions are GCed > (i.e. CatalogJanitor). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17195) Split/Merge - Update of FN info along with regionState change
[ https://issues.apache.org/jira/browse/HBASE-17195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17195: - Component/s: FavoredNodes > Split/Merge - Update of FN info along with regionState change > - > > Key: HBASE-17195 > URL: https://issues.apache.org/jira/browse/HBASE-17195 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0 > > > Follow up optimization to HBASE-16941 to reduce the number of edits to meta > during a split/merge. Once the balancer changes are in (where a lot of updates > happen), we will consolidate FN updates to META and see if it can be done better. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16942) FavoredNodes - Balancer improvements
[ https://issues.apache.org/jira/browse/HBASE-16942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16942: - Component/s: FavoredNodes > FavoredNodes - Balancer improvements > > > Key: HBASE-16942 > URL: https://issues.apache.org/jira/browse/HBASE-16942 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16942.master.001.patch, > HBASE-16942.master.002.patch, HBASE_16942_rough_draft.patch > > > This deals with the balancer based enhancements to favored nodes patch as > discussed in HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16956) Refactor FavoredNodePlan to use regionNames as keys
[ https://issues.apache.org/jira/browse/HBASE-16956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16956: - Component/s: FavoredNodes > Refactor FavoredNodePlan to use regionNames as keys > --- > > Key: HBASE-16956 > URL: https://issues.apache.org/jira/browse/HBASE-16956 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-16956.branch-1.001.patch, > HBASE-16956.master.001.patch, HBASE-16956.master.002.patch, > HBASE-16956.master.003.patch, HBASE-16956.master.004.patch, > HBASE-16956.master.005.patch, HBASE-16956.master.006.patch, > HBASE-16956.master.007.patch > > > We would like to rely on the FNPlan cache whether a region is offline or not. > Sticking to regionNames as keys makes that possible. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16943) FN Interfaces for writing tools for monitoring/operations
[ https://issues.apache.org/jira/browse/HBASE-16943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16943: - Component/s: FavoredNodes > FN Interfaces for writing tools for monitoring/operations > - > > Key: HBASE-16943 > URL: https://issues.apache.org/jira/browse/HBASE-16943 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > > HBASE-15532 introduces new interfaces and tools. This sub-task is only for > the new interfaces and their implementation. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17100) Implement Chore to sync FN info from Master to RegionServers
[ https://issues.apache.org/jira/browse/HBASE-17100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17100: - Component/s: FavoredNodes > Implement Chore to sync FN info from Master to RegionServers > > > Key: HBASE-17100 > URL: https://issues.apache.org/jira/browse/HBASE-17100 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE_17100_draft.patch, HBASE-17100.master.001.patch > > > Master will have a repair chore which will periodically sync fn information > from master to all the region servers. This will protect against rpc failures. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17101) FavoredNodes should not apply to system tables
[ https://issues.apache.org/jira/browse/HBASE-17101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17101: - Component/s: FavoredNodes > FavoredNodes should not apply to system tables > -- > > Key: HBASE-17101 > URL: https://issues.apache.org/jira/browse/HBASE-17101 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-17101.master.001.patch, > HBASE-17101.master.002.patch, HBASE-17101.master.003.patch, > HBASE-17101.master.004.patch, HBASE_17101_rough_draft.patch > > > As described in the doc (see HBASE-15531), we would like to start with user > tables for favored nodes. This task ensures FN does not apply to system > tables. > System tables are in memory and won't benefit from favored nodes. Since we > also maintain FN information for user regions in meta, it helps to keep > implementation simpler by ignoring system tables for the first iterations. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-16941) FavoredNodes - Split/Merge code paths
[ https://issues.apache.org/jira/browse/HBASE-16941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16941: - Component/s: FavoredNodes > FavoredNodes - Split/Merge code paths > - > > Key: HBASE-16941 > URL: https://issues.apache.org/jira/browse/HBASE-16941 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16941.master.001.patch, > HBASE-16941.master.002.patch, HBASE-16941.master.003.patch, > HBASE-16941.master.004.patch, HBASE-16941.master.005.patch, > HBASE-16941.master.006.patch, HBASE-16941.master.007.patch, > HBASE-16941.master.008.patch, HBASE-16941.master.009.patch, > HBASE-16941.master.010.patch, HBASE-16941.master.011.patch, > HBASE-16941.master.012.patch, HBASE-16941.master.013.patch, > HBASE-16941.master.014.patch > > > This jira is to deal with the split/merge logic discussed as part of > HBASE-15532. The design document can be seen at HBASE-15531. The specific > changes are: > Split and merged regions should inherit favored node information from parent > regions. For splits also include some randomness so even if there are > subsequent splits, the regions will be more or less distributed. For split, > we include 2 FN from the parent and generate one random node. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
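The split rule described in HBASE-16941 (two FN inherited from the parent plus one randomly generated node) might look something like the sketch below. The class and method names are hypothetical, and the choice of which two parent nodes to keep (here simply the first two) is an assumption, since the description does not say.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not the HBase implementation.
public class SplitFavoredNodes {
  /**
   * Builds a daughter region's favored nodes: two inherited from the parent
   * plus one picked at random from the remaining candidate servers.
   */
  static List<String> forDaughter(List<String> parentFN, List<String> candidates, Random rng) {
    if (parentFN.size() < 2) {
      throw new IllegalArgumentException("parent needs at least two favored nodes");
    }
    // Assumption: keep the parent's first two favored nodes.
    List<String> result = new ArrayList<>(parentFN.subList(0, 2));
    // The random node must differ from the two inherited ones.
    List<String> pool = new ArrayList<>(candidates);
    pool.removeAll(result);
    if (pool.isEmpty()) {
      throw new IllegalStateException("no server available for the third favored node");
    }
    result.add(pool.get(rng.nextInt(pool.size())));
    return result;
  }
}
```

Because the third node is drawn at random for each daughter, repeated splits tend to spread regions across servers instead of pinning every descendant to the parent's original three nodes, which matches the stated motivation.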
[jira] [Updated] (HBASE-15533) RSGroup related favored nodes enhancements
[ https://issues.apache.org/jira/browse/HBASE-15533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-15533: - Component/s: FavoredNodes > RSGroup related favored nodes enhancements > -- > > Key: HBASE-15533 > URL: https://issues.apache.org/jira/browse/HBASE-15533 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Francis Liu >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-15533.rough.draft.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17281) FN should use datanode port from hdfs configuration
[ https://issues.apache.org/jira/browse/HBASE-17281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854777#comment-15854777 ] Thiruvel Thirumoolan commented on HBASE-17281: -- Thanks [~stack]! > FN should use datanode port from hdfs configuration > --- > > Key: HBASE-17281 > URL: https://issues.apache.org/jira/browse/HBASE-17281 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-17281.master.001.patch, > HBASE-17281.master.002.patch, HBASE-17281.master.003.patch, > HBASE-17281.master.004.patch, HBASE-17281.master.005.patch > > > Currently we use the ServerName port for providing favored node hints. We > should use the DN port from hdfs-site.xml instead to avoid warning messages > in region server logs. The warnings come from this section of HDFS code, > which moves across classes. > https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java#L1758 > {code} > private boolean[] getPinnings(DatanodeInfo[] nodes) { > if (favoredNodes == null) { > return null; > } else { > boolean[] pinnings = new boolean[nodes.length]; > HashSet<String> favoredSet = new HashSet<>(Arrays.asList(favoredNodes)); > for (int i = 0; i < nodes.length; i++) { > pinnings[i] = favoredSet.remove(nodes[i].getXferAddrWithHostname()); > LOG.debug("{} was chosen by name node (favored={}).", > nodes[i].getXferAddrWithHostname(), pinnings[i]); > } > if (!favoredSet.isEmpty()) { > // There is one or more favored nodes that were not allocated. > LOG.warn("These favored nodes were specified but not chosen: " > + favoredSet + " Specified favored nodes: " > + Arrays.toString(favoredNodes)); > } > return pinnings; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
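In the quoted getPinnings code, a favored node is only matched when its string equals the datanode's hostname plus transfer port, so a hint built with the region server port is never removed from favoredSet and triggers the warning. A rough sketch of building the hint with the datanode port instead is shown below. The class and method names are hypothetical, and the address string is passed in directly rather than read through the Hadoop Configuration API; in a real deployment it would presumably come from the `dfs.datanode.address` setting in hdfs-site.xml.

```java
// Hypothetical helper, not the HBASE-17281 patch.
public class FavoredNodeAddress {
  /** Extracts the port from a "host:port" value such as a datanode address setting. */
  static int portOf(String hostPort) {
    int idx = hostPort.lastIndexOf(':');
    if (idx < 0 || idx == hostPort.length() - 1) {
      throw new IllegalArgumentException("expected host:port but got " + hostPort);
    }
    return Integer.parseInt(hostPort.substring(idx + 1));
  }

  /**
   * Builds a favored-node hint from the region server's hostname and the
   * datanode address, so the port matches what the datanode reports.
   */
  static String hint(String regionServerHost, String datanodeAddress) {
    return regionServerHost + ":" + portOf(datanodeAddress);
  }
}
```

For example, `FavoredNodeAddress.hint("rs1.example.com", "0.0.0.0:50010")` yields `rs1.example.com:50010`; the hostname and address here are made-up values.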
[jira] [Updated] (HBASE-15532) core favored nodes enhancements
[ https://issues.apache.org/jira/browse/HBASE-15532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-15532: - Component/s: FavoredNodes > core favored nodes enhancements > --- > > Key: HBASE-15532 > URL: https://issues.apache.org/jira/browse/HBASE-15532 > Project: HBase > Issue Type: Sub-task > Components: FavoredNodes >Reporter: Francis Liu >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-15532.master.000.patch, > HBASE-15532.master.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17281) FN should use datanode port from hdfs configuration
[ https://issues.apache.org/jira/browse/HBASE-17281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15854752#comment-15854752 ] Thiruvel Thirumoolan commented on HBASE-17281: -- [~stack], Can you pls create a "favorednodes" component so I can tag all FN related jiras? Thanks! > FN should use datanode port from hdfs configuration > --- > > Key: HBASE-17281 > URL: https://issues.apache.org/jira/browse/HBASE-17281 > Project: HBase > Issue Type: Sub-task >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-17281.master.001.patch, > HBASE-17281.master.002.patch, HBASE-17281.master.003.patch, > HBASE-17281.master.004.patch, HBASE-17281.master.005.patch > > > Currently we use the ServerName port for providing favored node hints. We > should use the DN port from hdfs-site.xml instead to avoid warning messages > in region server logs. The warnings come from this section of HDFS code, > which moves across classes. > https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java#L1758 > {code} > private boolean[] getPinnings(DatanodeInfo[] nodes) { > if (favoredNodes == null) { > return null; > } else { > boolean[] pinnings = new boolean[nodes.length]; > HashSet<String> favoredSet = new HashSet<>(Arrays.asList(favoredNodes)); > for (int i = 0; i < nodes.length; i++) { > pinnings[i] = favoredSet.remove(nodes[i].getXferAddrWithHostname()); > LOG.debug("{} was chosen by name node (favored={}).", > nodes[i].getXferAddrWithHostname(), pinnings[i]); > } > if (!favoredSet.isEmpty()) { > // There is one or more favored nodes that were not allocated. > LOG.warn("These favored nodes were specified but not chosen: " > + favoredSet + " Specified favored nodes: " > + Arrays.toString(favoredNodes)); > } > return pinnings; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17281) FN should use datanode port from hdfs configuration
[ https://issues.apache.org/jira/browse/HBASE-17281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-17281: - Attachment: HBASE-17281.master.005.patch > FN should use datanode port from hdfs configuration > --- > > Key: HBASE-17281 > URL: https://issues.apache.org/jira/browse/HBASE-17281 > Project: HBase > Issue Type: Sub-task >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-17281.master.001.patch, > HBASE-17281.master.002.patch, HBASE-17281.master.003.patch, > HBASE-17281.master.004.patch, HBASE-17281.master.005.patch > > > Currently we use the ServerName port for providing favored node hints. We > should use the DN port from hdfs-site.xml instead to avoid warning messages > in region server logs. The warnings come from this section of HDFS code, > which moves across classes. > https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java#L1758 > {code} > private boolean[] getPinnings(DatanodeInfo[] nodes) { > if (favoredNodes == null) { > return null; > } else { > boolean[] pinnings = new boolean[nodes.length]; > HashSet<String> favoredSet = new HashSet<>(Arrays.asList(favoredNodes)); > for (int i = 0; i < nodes.length; i++) { > pinnings[i] = favoredSet.remove(nodes[i].getXferAddrWithHostname()); > LOG.debug("{} was chosen by name node (favored={}).", > nodes[i].getXferAddrWithHostname(), pinnings[i]); > } > if (!favoredSet.isEmpty()) { > // There is one or more favored nodes that were not allocated. > LOG.warn("These favored nodes were specified but not chosen: " > + favoredSet + " Specified favored nodes: " > + Arrays.toString(favoredNodes)); > } > return pinnings; > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)