[jira] [Updated] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table
[ https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-21388: --- Affects Version/s: 2.1.1 2.2.0 3.0.0 2.0.2 > No need to instantiate MemStoreLAB for master which not carry table > --- > > Key: HBASE-21388 > URL: https://issues.apache.org/jira/browse/HBASE-21388 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2 >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang >Priority: Major > Attachments: HBASE-21388.master.001.patch, > HBASE-21388.master.002.patch > > > We found this log in our master. > 2018-10-26,10:00:00,449 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data > MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0 > 2018-10-26,10:00:00,452 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index > MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0 > > Same with HBASE-21290, we don't need to instantiate MemStore for master which > not carry table. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21374) Backport HBASE-21342 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mazhenlin updated HBASE-21374: -- Attachment: HBASE-21374.branch-1.003.patch > Backport HBASE-21342 to branch-1 > > > Key: HBASE-21374 > URL: https://issues.apache.org/jira/browse/HBASE-21374 > Project: HBase > Issue Type: Task >Reporter: Mike Drob >Assignee: mazhenlin >Priority: Major > Attachments: HBASE-21374.branch-1.001.patch, > HBASE-21374.branch-1.002.patch, HBASE-21374.branch-1.003.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21374) Backport HBASE-21342 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1778#comment-1778 ] Hadoop QA commented on HBASE-21374: ---
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 52s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 29s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 36s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 36s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 28s{color} | {color:red} hbase-server: The patch generated 1 new + 10 unchanged - 0 fixed = 11 total (was 10) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 38s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 2m 6s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}147m 48s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}179m 26s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.cleaner.TestHFileCleaner |
| | hadoop.hbase.regionserver.TestSplitTransactionOnCluster |
| | hadoop.hbase.replication.TestReplicationSmallTests |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| JIRA Issue | HBASE-21374 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945978/HBASE-21374.branch-1.002.patch |
| Optional T
[jira] [Updated] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table
[ https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-21388: --- Attachment: HBASE-21388.master.002.patch > No need to instantiate MemStoreLAB for master which not carry table > --- > > Key: HBASE-21388 > URL: https://issues.apache.org/jira/browse/HBASE-21388 > Project: HBase > Issue Type: Improvement >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang >Priority: Major > Attachments: HBASE-21388.master.001.patch, > HBASE-21388.master.002.patch > > > We found this log in our master. > 2018-10-26,10:00:00,449 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data > MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0 > 2018-10-26,10:00:00,452 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index > MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0 > > Same with HBASE-21290, we don't need to instantiate MemStore for master which > not carry table. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20952) Re-visit the WAL API
[ https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1744#comment-1744 ] Hudson commented on HBASE-20952: Results for branch HBASE-20952 [build #32 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/32/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/32//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/32//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/32//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Re-visit the WAL API > > > Key: HBASE-20952 > URL: https://issues.apache.org/jira/browse/HBASE-20952 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Josh Elser >Priority: Major > Attachments: 20952.v1.txt > > > Take a step back from the current WAL implementations and think about what an > HBase WAL API should look like. What are the primitive calls that we require > to guarantee durability of writes with a high degree of performance? > The API needs to take the current implementations into consideration. We > should also have a mind for what is happening in the Ratis LogService (but > the LogService should not dictate what HBase's WAL API looks like RATIS-272). > Other "systems" inside of HBase that use WALs are replication and > backup&restore. Replication has the use-case for "tail"'ing the WAL which we > should provide via our new API. 
B&R doesn't do anything fancy (IIRC). We > should make sure all consumers are generally going to be OK with the API we > create. > The API may be "OK" (or OK in a part). We need to also consider other methods > which were "bolted" on such as {{AbstractFSWAL}} and > {{WALFileLengthProvider}}. Other corners of "WAL use" (like the > {{WALSplitter}} should also be looked at to use WAL-APIs only). > We also need to make sure that adequate interface audience and stability > annotations are chosen. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on
[ https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-21395: --- Attachment: HBASE-21395.branch-2.0.004.patch > Abort split/merge procedure if there is a table procedure of the same table > going on > > > Key: HBASE-21395 > URL: https://issues.apache.org/jira/browse/HBASE-21395 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.1.0, 2.0.2 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 2.1.2 > > Attachments: HBASE-21395.branch-2.0.001.patch, > HBASE-21395.branch-2.0.002.patch, HBASE-21395.branch-2.0.003.patch, > HBASE-21395.branch-2.0.004.patch > > > In my ITBLL runs, I often see a split/merge procedure and a table > procedure (like ModifyTableProcedure) happen at the same time. Since there > are race conditions between these two kinds of procedures, this causes > serious problems, e.g. the split/merged parent is brought online by the table > procedure, or the split/merged region makes the whole table procedure > roll back. > Talked with [~Apache9] offline today: this kind of problem was solved in > branch-2+, since there is a fence so that only one RTSP can run against a single > region at the same time. > To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safe > fence in the split/merge procedure: if there is a table procedure going on > against the same table, then abort the split/merge procedure. Aborting the > split/merge procedure at the beginning of the execution is no big deal, > compared with the mess the race would otherwise cause... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
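The fence described in HBASE-21395 above can be pictured roughly as follows. This is a minimal sketch and not the attached patch: the class `SplitMergeFence` and all of its methods are hypothetical names invented for illustration, and the sketch does not use HBase's procedure framework at all. It only shows the idea of checking, at the start of a split/merge, whether a table-level procedure is running against the same table.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical model of the fence; not the HBase procedure framework. */
final class SplitMergeFence {
  // Table name -> number of table-level procedures currently running on it.
  // This bookkeeping is an assumption made for the sketch.
  private final Map<String, Integer> runningTableProcedures = new ConcurrentHashMap<>();

  /** Record that a table procedure (e.g. a modify-table) has started. */
  void tableProcedureStarted(String table) {
    runningTableProcedures.merge(table, 1, Integer::sum);
  }

  /** Record that a table procedure has finished; drop the entry at zero. */
  void tableProcedureFinished(String table) {
    runningTableProcedures.computeIfPresent(table, (t, n) -> n > 1 ? n - 1 : null);
  }

  /** First step of a split/merge: false means the split/merge should abort. */
  boolean maySplitOrMerge(String table) {
    return !runningTableProcedures.containsKey(table);
  }

  public static void main(String[] args) {
    SplitMergeFence fence = new SplitMergeFence();
    fence.tableProcedureStarted("t1");
    System.out.println(fence.maySplitOrMerge("t1")); // false: a table procedure is running on t1
    fence.tableProcedureFinished("t1");
    System.out.println(fence.maySplitOrMerge("t1")); // true: nothing is running on t1 any more
  }
}
```

The check is done once, up front, which matches the description's point that aborting at the beginning of execution is cheap. The sketch does not guard against a table procedure starting right after the check; how the real patch handles that is not stated in the message.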
[jira] [Comment Edited] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)
[ https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1724#comment-1724 ] Jingyun Tian edited comment on HBASE-19121 at 10/29/18 4:05 AM: Sounds like we need to get the regions in all problematic states, for all tables, to get a full list? I think adding a tab to the navigation bar that dumps the RIT as a table, also viewable as txt, could be easier to use? Maybe I can make a demo and then we can compare which is better. was (Author: tianjingyun): Sounds like we need to get the regions in all problematic states, for all tables, to get a full list? I think adding a tab to the navigation bar that dumps the RIT as a table, also viewable as txt, could be easier to use? > HBCK for AMv2 (A.K.A HBCK2) > --- > > Key: HBASE-19121 > URL: https://issues.apache.org/jira/browse/HBASE-19121 > Project: HBase > Issue Type: Umbrella > Components: hbck, hbck2 >Reporter: stack >Assignee: Umesh Agashe >Priority: Major > Fix For: hbck2-1.0.0 > > Attachments: hbase-19121.master.001.patch > > > We don't have an hbck for the new AM. Old hbck may actually do damage going > against AMv2. > Fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)
[ https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1724#comment-1724 ] Jingyun Tian commented on HBASE-19121: -- Sounds like we need to get the regions in all problematic states, for all tables, to get a full list? I think adding a tab to the navigation bar that dumps the RIT as a table, also viewable as txt, could be easier to use? > HBCK for AMv2 (A.K.A HBCK2) > --- > > Key: HBASE-19121 > URL: https://issues.apache.org/jira/browse/HBASE-19121 > Project: HBase > Issue Type: Umbrella > Components: hbck, hbck2 >Reporter: stack >Assignee: Umesh Agashe >Priority: Major > Fix For: hbck2-1.0.0 > > Attachments: hbase-19121.master.001.patch > > > We don't have an hbck for the new AM. Old hbck may actually do damage going > against AMv2. > Fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table
[ https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1712#comment-1712 ] Guanghao Zhang commented on HBASE-21388: The MemStoreLAB and ChunkCreator are global per process. The minicluster starts the master and RS in the same process, so the UT will fail because the RS will create a new MemStoreLAB. Will find a way to resolve this later. > No need to instantiate MemStoreLAB for master which not carry table > --- > > Key: HBASE-21388 > URL: https://issues.apache.org/jira/browse/HBASE-21388 > Project: HBase > Issue Type: Improvement >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang >Priority: Major > Attachments: HBASE-21388.master.001.patch > > > We found this log in our master. > 2018-10-26,10:00:00,449 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data > MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0 > 2018-10-26,10:00:00,452 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index > MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0 > > Same with HBASE-21290, we don't need to instantiate MemStore for master which > not carry table. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table
[ https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-21388: --- Attachment: HBASE-21388.master.001.patch > No need to instantiate MemStoreLAB for master which not carry table > --- > > Key: HBASE-21388 > URL: https://issues.apache.org/jira/browse/HBASE-21388 > Project: HBase > Issue Type: Improvement >Reporter: Guanghao Zhang >Priority: Major > Attachments: HBASE-21388.master.001.patch > > > We found this log in our master. > 2018-10-26,10:00:00,449 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data > MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0 > 2018-10-26,10:00:00,452 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index > MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0 > > Same with HBASE-21290, we don't need to instantiate MemStore for master which > not carry table. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table
[ https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-21388: --- Status: Patch Available (was: Open) > No need to instantiate MemStoreLAB for master which not carry table > --- > > Key: HBASE-21388 > URL: https://issues.apache.org/jira/browse/HBASE-21388 > Project: HBase > Issue Type: Improvement >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang >Priority: Major > Attachments: HBASE-21388.master.001.patch > > > We found this log in our master. > 2018-10-26,10:00:00,449 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data > MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0 > 2018-10-26,10:00:00,452 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index > MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0 > > Same with HBASE-21290, we don't need to instantiate MemStore for master which > not carry table. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table
[ https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang reassigned HBASE-21388: -- Assignee: Guanghao Zhang > No need to instantiate MemStoreLAB for master which not carry table > --- > > Key: HBASE-21388 > URL: https://issues.apache.org/jira/browse/HBASE-21388 > Project: HBase > Issue Type: Improvement >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang >Priority: Major > Attachments: HBASE-21388.master.001.patch > > > We found this log in our master. > 2018-10-26,10:00:00,449 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data > MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0 > 2018-10-26,10:00:00,452 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index > MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0 > > Same with HBASE-21290, we don't need to instantiate MemStore for master which > not carry table. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21388) No need to instantiate MemStoreLAB for master which not carry table
[ https://issues.apache.org/jira/browse/HBASE-21388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-21388: --- Summary: No need to instantiate MemStoreLAB for master which not carry table (was: No need to instantiate MemStore for master which not carry table) > No need to instantiate MemStoreLAB for master which not carry table > --- > > Key: HBASE-21388 > URL: https://issues.apache.org/jira/browse/HBASE-21388 > Project: HBase > Issue Type: Improvement >Reporter: Guanghao Zhang >Priority: Major > > We found this log in our master. > 2018-10-26,10:00:00,449 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating data > MemStoreChunkPool with chunk size 2 MB, max count 737, initial count 0 > 2018-10-26,10:00:00,452 INFO > [master/c4-hadoop-tst-ct16:42900:becomeActiveMaster] > org.apache.hadoop.hbase.regionserver.ChunkCreator: Allocating index > MemStoreChunkPool with chunk size 204.80 KB, max count 819, initial count 0 > > Same with HBASE-21290, we don't need to instantiate MemStore for master which > not carry table. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)
[ https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1702#comment-1702 ] stack commented on HBASE-19121: --- It's as you state: if it is in the UI, it is always available to the operator, but yeah, if the UI is not up, then the operator is stuck. Perhaps we work on making sure the UI is always available? I was thinking that the operator could click on the OPENING count in the tables panel of the UI and get a page that lists all the regions in OPENING. Then the same for OPEN, CLOSED, CLOSING? I made a start a while back but didn't get far. It would be useful for the operator, who could use curl or wget or lynx to get the list. > HBCK for AMv2 (A.K.A HBCK2) > --- > > Key: HBASE-19121 > URL: https://issues.apache.org/jira/browse/HBASE-19121 > Project: HBase > Issue Type: Umbrella > Components: hbck, hbck2 >Reporter: stack >Assignee: Umesh Agashe >Priority: Major > Fix For: hbck2-1.0.0 > > Attachments: hbase-19121.master.001.patch > > > We don't have an hbck for the new AM. Old hbck may actually do damage going > against AMv2. > Fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19121) HBCK for AMv2 (A.K.A HBCK2)
[ https://issues.apache.org/jira/browse/HBASE-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1689#comment-1689 ] Jingyun Tian commented on HBASE-19121: -- [~stack] planning to build 2 tools as your doc already mentioned: # dump a list of stuck procedures as txt. # dump a list of RIT as txt. Should we build these tools in Master UI or Canary tools? If we build this in Master UI, it's easier for operator to use. But if Master UI is not up, it's unavailable (This situation should be rare?). Or build them in Canary tools? Please let me know your thoughts. > HBCK for AMv2 (A.K.A HBCK2) > --- > > Key: HBASE-19121 > URL: https://issues.apache.org/jira/browse/HBASE-19121 > Project: HBase > Issue Type: Umbrella > Components: hbck, hbck2 >Reporter: stack >Assignee: Umesh Agashe >Priority: Major > Fix For: hbck2-1.0.0 > > Attachments: hbase-19121.master.001.patch > > > We don't have an hbck for the new AM. Old hbck may actually do damage going > against AMv2. > Fix. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21374) Backport HBASE-21342 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mazhenlin updated HBASE-21374: -- Status: Patch Available (was: Open) > Backport HBASE-21342 to branch-1 > > > Key: HBASE-21374 > URL: https://issues.apache.org/jira/browse/HBASE-21374 > Project: HBase > Issue Type: Task >Reporter: Mike Drob >Assignee: mazhenlin >Priority: Major > Attachments: HBASE-21374.branch-1.001.patch, > HBASE-21374.branch-1.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21374) Backport HBASE-21342 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-21374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mazhenlin updated HBASE-21374: -- Attachment: HBASE-21374.branch-1.002.patch > Backport HBASE-21342 to branch-1 > > > Key: HBASE-21374 > URL: https://issues.apache.org/jira/browse/HBASE-21374 > Project: HBase > Issue Type: Task >Reporter: Mike Drob >Assignee: mazhenlin >Priority: Major > Attachments: HBASE-21374.branch-1.001.patch, > HBASE-21374.branch-1.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21401) Sanity check in BaseDecoder#parseCell
[ https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21401: -- Component/s: regionserver > Sanity check in BaseDecoder#parseCell > - > > Key: HBASE-21401 > URL: https://issues.apache.org/jira/browse/HBASE-21401 > Project: HBase > Issue Type: Sub-task > Components: regionserver >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Critical > Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2 > > Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch > > > In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to > initialize the Cell without a sanity check (check each field's offset&len > exceed the byte buffer or not), so ArrayIndexOutOfBoundsException may happen > when read the cell's fields, such as HBASE-213, it's hard to debug this kind > of bug. > An earlier check will help to find such kind of bugs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21401) Sanity check in BaseDecoder#parseCell
[ https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-21401: -- Priority: Critical (was: Major) > Sanity check in BaseDecoder#parseCell > - > > Key: HBASE-21401 > URL: https://issues.apache.org/jira/browse/HBASE-21401 > Project: HBase > Issue Type: Sub-task > Components: regionserver >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Critical > Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2 > > Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch > > > In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to > initialize the Cell without a sanity check (check each field's offset&len > exceed the byte buffer or not), so ArrayIndexOutOfBoundsException may happen > when read the cell's fields, such as HBASE-213, it's hard to debug this kind > of bug. > An earlier check will help to find such kind of bugs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21401) Sanity check in BaseDecoder#parseCell
[ https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1671#comment-1671 ] Zheng Hu commented on HBASE-21401: -- For UT TestMobDataBlockEncoding#testDataBlockEncoding, all encodings except ROW_INDEX_V1 work fine; I need to find out what's wrong with ROW_INDEX_V1... > Sanity check in BaseDecoder#parseCell > - > > Key: HBASE-21401 > URL: https://issues.apache.org/jira/browse/HBASE-21401 > Project: HBase > Issue Type: Sub-task >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Major > Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2 > > Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch > > > In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to > initialize the Cell without a sanity check (check each field's offset&len > exceed the byte buffer or not), so ArrayIndexOutOfBoundsException may happen > when read the cell's fields, such as HBASE-213, it's hard to debug this kind > of bug. > An earlier check will help to find such kind of bugs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
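The kind of early check HBASE-21401 asks for might look roughly like the sketch below. It is an illustration under stated assumptions, not the attached patch and not the code in BaseDecoder#parseCell: the class `CellSanityCheck` is a hypothetical name, and the record layout used here (`[int keyLen][int valueLen][key][value]`) is a deliberately simplified stand-in for the real cell format. The point is only that validating each length field against the buffer up front turns a later, hard-to-trace ArrayIndexOutOfBoundsException into an immediate error that names the bad field.

```java
import java.nio.ByteBuffer;

/** Hypothetical early bounds check on a simplified length-prefixed record. */
final class CellSanityCheck {
  /**
   * Validates a record laid out as [int keyLen][int valueLen][key][value],
   * starting at offset and spanning length bytes of buf. The layout is an
   * assumption made for this sketch.
   */
  static void check(ByteBuffer buf, int offset, int length) {
    if (offset < 0 || length < 0 || (long) offset + length > buf.limit()) {
      throw new IllegalArgumentException("cell range offset=" + offset + ", length=" + length
          + " exceeds buffer limit " + buf.limit());
    }
    if (length < 2 * Integer.BYTES) {
      throw new IllegalArgumentException("cell too short for its two length fields: " + length);
    }
    // Absolute reads: they do not move the buffer's position.
    int keyLen = buf.getInt(offset);
    int valueLen = buf.getInt(offset + Integer.BYTES);
    if (keyLen < 0 || valueLen < 0) {
      throw new IllegalArgumentException("negative length: keyLen=" + keyLen
          + ", valueLen=" + valueLen);
    }
    // Sum in long arithmetic so corrupt, huge lengths cannot overflow an int.
    long needed = 2L * Integer.BYTES + keyLen + valueLen;
    if (needed > length) {
      throw new IllegalArgumentException("keyLen=" + keyLen + " and valueLen=" + valueLen
          + " need " + needed + " bytes but the cell has only " + length);
    }
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(13);
    buf.putInt(3).putInt(2).put(new byte[] {1, 2, 3}).put(new byte[] {4, 5});
    check(buf, 0, 13); // well-formed record: returns normally
    buf.putInt(4, 100); // corrupt the value length
    try {
      check(buf, 0, 13);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Throwing at decode time keeps the failure next to the corrupt input rather than at whichever later read first walks off the end of the array.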
[jira] [Commented] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere
[ https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1669#comment-1669 ] Guanghao Zhang commented on HBASE-21325: Pushed to master and branch-2. Thanks [~Apache9] for reviewing. And ping [~stack] for branch-2.1 and branch-2.0. Reopen this to backport to branch-2.1 and branch-2.0 after you released 2.1.1. Thanks. > Force to terminate regionserver when abort hang in somewhere > > > Key: HBASE-21325 > URL: https://issues.apache.org/jira/browse/HBASE-21325 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2 >Reporter: Duo Zhang >Assignee: Guanghao Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21325.master.001.patch, > HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, > HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, > HBASE-21325.master.005.patch > > > When testing sync replication, I found that, if I transit the remote cluster > to DA, while the local cluster is still in A, the region server will hang > when shutdown. As the fsOk flag only test the local cluster(which is > reasonable), we will enter the waitOnAllRegionsToClose, and since the WAL is > broken(the remote wal directory is gone) so we will never succeed. And this > lead to an infinite wait inside waitOnAllRegionsToClose. > So I think here we should have an upper bound for the wait time in > waitOnAllRegionsToClose method. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere
[ https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-21325: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Force to terminate regionserver when abort hang in somewhere > > > Key: HBASE-21325 > URL: https://issues.apache.org/jira/browse/HBASE-21325 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2 >Reporter: Duo Zhang >Assignee: Guanghao Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21325.master.001.patch, > HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, > HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, > HBASE-21325.master.005.patch > > > When testing sync replication, I found that, if I transit the remote cluster > to DA, while the local cluster is still in A, the region server will hang > when shutdown. As the fsOk flag only test the local cluster(which is > reasonable), we will enter the waitOnAllRegionsToClose, and since the WAL is > broken(the remote wal directory is gone) so we will never succeed. And this > lead to an infinite wait inside waitOnAllRegionsToClose. > So I think here we should have an upper bound for the wait time in > waitOnAllRegionsToClose method. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere
[ https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-21325: --- Affects Version/s: 2.1.1 2.2.0 3.0.0 2.0.2 > Force to terminate regionserver when abort hang in somewhere > > > Key: HBASE-21325 > URL: https://issues.apache.org/jira/browse/HBASE-21325 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2 >Reporter: Duo Zhang >Assignee: Guanghao Zhang >Priority: Major > Fix For: 3.0.0, 2.2.0 > > Attachments: HBASE-21325.master.001.patch, > HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, > HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, > HBASE-21325.master.005.patch > > > When testing sync replication, I found that, if I transit the remote cluster > to DA, while the local cluster is still in A, the region server will hang > when shutdown. As the fsOk flag only test the local cluster(which is > reasonable), we will enter the waitOnAllRegionsToClose, and since the WAL is > broken(the remote wal directory is gone) so we will never succeed. And this > lead to an infinite wait inside waitOnAllRegionsToClose. > So I think here we should have an upper bound for the wait time in > waitOnAllRegionsToClose method. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere
[ https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang updated HBASE-21325:
-----------------------------------
    Fix Version/s: 2.2.0
                   3.0.0

> Force to terminate regionserver when abort hang in somewhere
> ------------------------------------------------------------
>
> Key: HBASE-21325
> URL: https://issues.apache.org/jira/browse/HBASE-21325
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2
> Reporter: Duo Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Fix For: 3.0.0, 2.2.0
> Attachments: HBASE-21325.master.001.patch, HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, HBASE-21325.master.005.patch
>
> When testing sync replication, I found that if I transit the remote cluster to DA while the local cluster is still in A, the region server hangs on shutdown. Because the fsOk flag only tests the local cluster (which is reasonable), we enter waitOnAllRegionsToClose, and since the WAL is broken (the remote WAL directory is gone), the close never succeeds. This leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound on the wait time in the waitOnAllRegionsToClose method.
[jira] [Updated] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere
[ https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang updated HBASE-21325:
-----------------------------------
    Release Note: Adds two new configs, hbase.regionserver.abort.timeout and hbase.regionserver.abort.timeout.task. If a regionserver abort times out, it schedules an abort timeout task to run. The default abort task is SystemExitWhenAbortTimeout, which forcibly terminates the region server when the abort times out. You can configure a custom abort timeout task via hbase.regionserver.abort.timeout.task.

> Force to terminate regionserver when abort hang in somewhere
> ------------------------------------------------------------
>
> Key: HBASE-21325
> URL: https://issues.apache.org/jira/browse/HBASE-21325
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0, 2.2.0, 2.1.1, 2.0.2
> Reporter: Duo Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Fix For: 3.0.0, 2.2.0
> Attachments: HBASE-21325.master.001.patch, HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, HBASE-21325.master.005.patch
>
> When testing sync replication, I found that if I transit the remote cluster to DA while the local cluster is still in A, the region server hangs on shutdown. Because the fsOk flag only tests the local cluster (which is reasonable), we enter waitOnAllRegionsToClose, and since the WAL is broken (the remote WAL directory is gone), the close never succeeds. This leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound on the wait time in the waitOnAllRegionsToClose method.
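The idea in the release note (arm a timeout when the abort starts, and run a pluggable task if the abort has not finished in time) can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patches: the class `AbortTimeoutSketch`, the method `abortWithTimeout`, and its parameters are hypothetical names, and the timeout task here is a plain callback rather than the real SystemExitWhenAbortTimeout, which terminates the process.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of an abort guarded by a timeout task; not the HBase implementation. */
final class AbortTimeoutSketch {

  /**
   * Runs abortWork on the calling thread. If it has not finished after timeoutMs,
   * onTimeout runs on a daemon timer thread (in a real server this task would
   * typically terminate the process, so a hung abort cannot block forever).
   * Returns true if the timeout task fired before abortWork returned.
   */
  static boolean abortWithTimeout(Runnable abortWork, long timeoutMs, Runnable onTimeout) {
    AtomicBoolean fired = new AtomicBoolean(false);
    Timer timer = new Timer("abort-monitor", true); // daemon thread, assumed name
    timer.schedule(new TimerTask() {
      @Override
      public void run() {
        fired.set(true);
        onTimeout.run();
      }
    }, timeoutMs);
    try {
      abortWork.run();
    } finally {
      // The abort finished (or threw): the monitor is no longer needed.
      timer.cancel();
    }
    return fired.get();
  }
}
```

Running the timeout task on a separate thread is the point of the design: the thread stuck inside the abort cannot be relied on to notice that its own deadline has passed.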
[jira] [Commented] (HBASE-21401) Sanity check in BaseDecoder#parseCell
[ https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1653#comment-1653 ]

Zheng Hu commented on HBASE-21401:
----------------------------------
Some UTs still failed, such as:
1. TestTags#testFlushAndCompactionwithCombinations
2. TestMobDataBlockEncoding#testDataBlockEncoding
The timeouts in TestAsyncQuotaAdminApi & TestReplicationSyncUpToolWithMultipleAsyncWAL are unrelated to this issue.

> Sanity check in BaseDecoder#parseCell
> -------------------------------------
>
> Key: HBASE-21401
> URL: https://issues.apache.org/jira/browse/HBASE-21401
> Project: HBase
> Issue Type: Sub-task
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
> Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch
>
> In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to initialize the Cell without a sanity check (that is, checking whether each field's offset and length exceed the byte buffer), so an ArrayIndexOutOfBoundsException may happen when reading the cell's fields, as in HBASE-213. This kind of bug is hard to debug.
> An earlier check will help to find such bugs.
[jira] [Commented] (HBASE-21375) Revisit the lock and queue implementation in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1648#comment-1648 ]

Allan Yang commented on HBASE-21375:
------------------------------------
My mistake, I misunderstood the behavior: the worker will find one executable procedure and execute it, leaving the others in the queue for other workers. No other concerns then; +1 for the patch.

> Revisit the lock and queue implementation in MasterProcedureScheduler
> ---------------------------------------------------------------------
>
> Key: HBASE-21375
> URL: https://issues.apache.org/jira/browse/HBASE-21375
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0, 2.2.0
> Attachments: HBASE-21375-UT.patch, HBASE-21375-UT2.patch, HBASE-21375-v1.patch, HBASE-21375-v2.patch, HBASE-21375.patch
>
> The problem with the old implementation is that we only check the first procedure in a queue to see whether it can run; if it cannot, we remove the queue from the run queue. So when adding a procedure to the scheduler, we have to try hard to put a procedure that can be executed at the front of the queue. If there are corner cases where we fail to do so, it will likely lead to a deadlock. That is why we have the tricky code when loading procedures and adding them to the scheduler, and also lots of 'if' statements in the doAdd method of MasterProcedureScheduler. But this is still not enough to make things right, so finally [~allan163] and I decided to change the logic in the doPoll method, where we use a loop to find whether there is any procedure that can be executed, not only the first one.
[jira] [Commented] (HBASE-21375) Revisit the lock and queue implementation in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1646#comment-1646 ]

Duo Zhang commented on HBASE-21375:
-----------------------------------
I do not think that iterating over the queue affects whether a TableQueue can be executed by multiple workers. A worker will iterate over the queue, but in the end it polls one procedure and returns, so other workers can still poll from the queue.

> Revisit the lock and queue implementation in MasterProcedureScheduler
> ---------------------------------------------------------------------
>
> Key: HBASE-21375
> URL: https://issues.apache.org/jira/browse/HBASE-21375
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0, 2.2.0
> Attachments: HBASE-21375-UT.patch, HBASE-21375-UT2.patch, HBASE-21375-v1.patch, HBASE-21375-v2.patch, HBASE-21375.patch
>
> The problem with the old implementation is that we only check the first procedure in a queue to see whether it can run; if it cannot, we remove the queue from the run queue. So when adding a procedure to the scheduler, we have to try hard to put a procedure that can be executed at the front of the queue. If there are corner cases where we fail to do so, it will likely lead to a deadlock. That is why we have the tricky code when loading procedures and adding them to the scheduler, and also lots of 'if' statements in the doAdd method of MasterProcedureScheduler. But this is still not enough to make things right, so finally [~allan163] and I decided to change the logic in the doPoll method, where we use a loop to find whether there is any procedure that can be executed, not only the first one.
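The polling behavior discussed in this thread (scan the queue for the first procedure that can run, remove only that one, and return) might look roughly like the sketch below. It is an illustration under simplifying assumptions, not the MasterProcedureScheduler code: `PollSketch`, `pollFirstRunnable`, and the `canRun` predicate are hypothetical, and real lock checks are reduced to a predicate.

```java
import java.util.Deque;
import java.util.Iterator;
import java.util.function.Predicate;

/** Hypothetical sketch of polling the first runnable item instead of only the head. */
final class PollSketch {

  /**
   * Walks the queue from the front and removes the first element that can run.
   * Elements that cannot run yet stay in place, in their original order, so a
   * blocked head no longer hides runnable work behind it.
   * Returns null if nothing in the queue can run right now.
   */
  static <T> T pollFirstRunnable(Deque<T> queue, Predicate<T> canRun) {
    for (Iterator<T> it = queue.iterator(); it.hasNext();) {
      T candidate = it.next();
      if (canRun.test(candidate)) {
        it.remove();
        return candidate;
      }
    }
    return null;
  }
}
```

Because each call removes at most one element, a second worker calling the same method can pick up the next runnable element, which matches the point made in the comment above about multiple workers.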
[jira] [Commented] (HBASE-21389) Revisit the procedure lock for sync replication
[ https://issues.apache.org/jira/browse/HBASE-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1438#comment-1438 ]

Hadoop QA commented on HBASE-21389:
-----------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -0 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 5m 38s | master passed |
| +1 | compile | 1m 58s | master passed |
| +1 | checkstyle | 1m 16s | master passed |
| +1 | shadedjars | 4m 25s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 7s | master passed |
| +1 | javadoc | 0m 30s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 5m 17s | the patch passed |
| +1 | compile | 1m 57s | the patch passed |
| +1 | javac | 1m 57s | the patch passed |
| +1 | checkstyle | 1m 14s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 26s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 11m 3s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 10s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 129m 33s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 22s | The patch does not generate ASF License warnings. |
| | | 173m 7s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21389 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945963/HBASE-21389-v1.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux fbde65b694c6 4.4.0-134-generic #160~14.04.1-Ubuntu SMP Fri Aug 17 11:07:07 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 7cdb525192 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14890/testReport/ |
| Max. process+thread count | 4609 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE
[jira] [Commented] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on
[ https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1417#comment-1417 ]

Hadoop QA commented on HBASE-21395:
-----------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -0 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || branch-2.0 Compile Tests ||
| +1 | mvninstall | 2m 50s | branch-2.0 passed |
| +1 | compile | 1m 42s | branch-2.0 passed |
| +1 | checkstyle | 1m 10s | branch-2.0 passed |
| +1 | shadedjars | 4m 1s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 19s | branch-2.0 passed |
| +1 | javadoc | 0m 32s | branch-2.0 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 3m 45s | the patch passed |
| +1 | compile | 1m 43s | the patch passed |
| +1 | javac | 1m 43s | the patch passed |
| +1 | checkstyle | 1m 9s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 3m 59s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 8m 19s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 21s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 119m 48s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
| | | 155m 16s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21395 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945962/HBASE-21395.branch-2.0.003.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 094956d856f9 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2.0 / a3b2686114 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14889/testReport/ |
| Max. process+thread count | 4245 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https
[jira] [Commented] (HBASE-21375) Revisit the lock and queue implementation in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1392#comment-1392 ]

Allan Yang commented on HBASE-21375:
------------------------------------
My concern is that, previously, one table's region operations could be executed by several workers concurrently, but now, since one worker iterates over the TableQueue, it will execute the procedures one by one. For a very big table, a modify table operation may not be tolerable.

> Revisit the lock and queue implementation in MasterProcedureScheduler
> ---------------------------------------------------------------------
>
> Key: HBASE-21375
> URL: https://issues.apache.org/jira/browse/HBASE-21375
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0, 2.2.0
> Attachments: HBASE-21375-UT.patch, HBASE-21375-UT2.patch, HBASE-21375-v1.patch, HBASE-21375-v2.patch, HBASE-21375.patch
>
> The problem with the old implementation is that we only check the first procedure in a queue to see whether it can run; if it cannot, we remove the queue from the run queue. So when adding a procedure to the scheduler, we have to try hard to put a procedure that can be executed at the front of the queue. If there are corner cases where we fail to do so, it will likely lead to a deadlock. That is why we have the tricky code when loading procedures and adding them to the scheduler, and also lots of 'if' statements in the doAdd method of MasterProcedureScheduler. But this is still not enough to make things right, so finally [~allan163] and I decided to change the logic in the doPoll method, where we use a loop to find whether there is any procedure that can be executed, not only the first one.
[jira] [Commented] (HBASE-21401) Sanity check in BaseDecoder#parseCell
[ https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1388#comment-1388 ]

Hadoop QA commented on HBASE-21401:
-----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 30s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 40s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 4s | master passed |
| +1 | compile | 2m 56s | master passed |
| +1 | checkstyle | 1m 54s | master passed |
| +1 | shadedjars | 5m 32s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 3m 3s | master passed |
| +1 | javadoc | 0m 57s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 7m 3s | the patch passed |
| +1 | compile | 2m 48s | the patch passed |
| -1 | javac | 0m 35s | hbase-common generated 2 new + 42 unchanged - 0 fixed = 44 total (was 42) |
| -1 | checkstyle | 0m 30s | hbase-common: The patch generated 30 new + 148 unchanged - 1 fixed = 178 total (was 149) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 5m 14s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 13m 15s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 3m 14s | the patch passed |
| +1 | javadoc | 1m 2s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 41s | hbase-common in the patch passed. |
| -1 | unit | 233m 15s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 50s | The patch does not generate ASF License warnings. |
| | | 296m 14s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAsyncQuotaAdminApi |
| | hadoop.hbase.regionserver.TestTags |
| | hadoop.hbase.mob.TestMobDataBlockEncoding |
| | hadoop.hbase.replication.multiwal.TestReplicationSyncUpToolWithMultipleAsyncWAL |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21401 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945952/HBASE-21401.v2.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 0d9b0f7101ca 3.13.0-153-generi
[jira] [Updated] (HBASE-21389) Revisit the procedure lock for sync replication
[ https://issues.apache.org/jira/browse/HBASE-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-21389:
-----------------------------
    Attachment: HBASE-21389-v1.patch

> Revisit the procedure lock for sync replication
> -----------------------------------------------
>
> Key: HBASE-21389
> URL: https://issues.apache.org/jira/browse/HBASE-21389
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2, Replication
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0
> Attachments: HBASE-21389-v1.patch, HBASE-21389-v1.patch, HBASE-21389.patch
[jira] [Commented] (HBASE-21375) Revisit the lock and queue implementation in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1384#comment-1384 ]

Duo Zhang commented on HBASE-21375:
-----------------------------------
Any other concerns? [~stack] [~allan163]
Thanks.

> Revisit the lock and queue implementation in MasterProcedureScheduler
> ---------------------------------------------------------------------
>
> Key: HBASE-21375
> URL: https://issues.apache.org/jira/browse/HBASE-21375
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0, 2.2.0
> Attachments: HBASE-21375-UT.patch, HBASE-21375-UT2.patch, HBASE-21375-v1.patch, HBASE-21375-v2.patch, HBASE-21375.patch
>
> The problem with the old implementation is that we only check the first procedure in a queue to see whether it can run; if it cannot, we remove the queue from the run queue. So when adding a procedure to the scheduler, we have to try hard to put a procedure that can be executed at the front of the queue. If there are corner cases where we fail to do so, it will likely lead to a deadlock. That is why we have the tricky code when loading procedures and adding them to the scheduler, and also lots of 'if' statements in the doAdd method of MasterProcedureScheduler. But this is still not enough to make things right, so finally [~allan163] and I decided to change the logic in the doPoll method, where we use a loop to find whether there is any procedure that can be executed, not only the first one.
[jira] [Commented] (HBASE-21325) Force to terminate regionserver when abort hang in somewhere
[ https://issues.apache.org/jira/browse/HBASE-21325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1383#comment-1383 ]

Duo Zhang commented on HBASE-21325:
-----------------------------------
+1.

> Force to terminate regionserver when abort hang in somewhere
> ------------------------------------------------------------
>
> Key: HBASE-21325
> URL: https://issues.apache.org/jira/browse/HBASE-21325
> Project: HBase
> Issue Type: Improvement
> Reporter: Duo Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Attachments: HBASE-21325.master.001.patch, HBASE-21325.master.001.patch, HBASE-21325.master.002.patch, HBASE-21325.master.003.patch, HBASE-21325.master.004.patch, HBASE-21325.master.005.patch
>
> When testing sync replication, I found that if I transit the remote cluster to DA while the local cluster is still in A, the region server hangs on shutdown. Because the fsOk flag only tests the local cluster (which is reasonable), we enter waitOnAllRegionsToClose, and since the WAL is broken (the remote WAL directory is gone), the close never succeeds. This leads to an infinite wait inside waitOnAllRegionsToClose.
> So I think we should have an upper bound on the wait time in the waitOnAllRegionsToClose method.
[jira] [Updated] (HBASE-21395) Abort split/merge procedure if there is a table procedure of the same table going on
[ https://issues.apache.org/jira/browse/HBASE-21395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allan Yang updated HBASE-21395:
------------------------------
    Attachment: HBASE-21395.branch-2.0.003.patch

> Abort split/merge procedure if there is a table procedure of the same table going on
> ------------------------------------------------------------------------------------
>
> Key: HBASE-21395
> URL: https://issues.apache.org/jira/browse/HBASE-21395
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.1.0, 2.0.2
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Fix For: 2.1.2
> Attachments: HBASE-21395.branch-2.0.001.patch, HBASE-21395.branch-2.0.002.patch, HBASE-21395.branch-2.0.003.patch
>
> In my ITBLL runs, I often see that when a split/merge procedure and a table procedure (like ModifyTableProcedure) happen at the same time, race conditions between these two kinds of procedures cause serious problems, e.g. the split/merged parent is brought online by the table procedure, or the split/merged region makes the whole table procedure roll back.
> I talked with [~Apache9] offline today; this kind of problem was solved in branch-2+, since there is a fence so that only one TRSP can run against a single region at the same time.
> To keep out of the mess in branch-2.0 and branch-2.1, I added a simple safety fence in the split/merge procedure: if there is a table procedure going on against the same table, abort the split/merge procedure. Aborting the split/merge procedure at the beginning of its execution is no big deal compared with the mess it would otherwise cause.
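The fence described above (refuse to start a split/merge while a table-level procedure is active on the same table) can be pictured with the small sketch below. Everything in it is hypothetical: `SplitMergeFenceSketch` and its methods are illustration-only names, tables are identified by plain strings, and the bookkeeping is a simple set rather than the master's procedure state. In a real scheduler the check would also need to happen under the same lock that admits table procedures, otherwise the check itself is racy.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a "table procedure in progress" fence for split/merge. */
final class SplitMergeFenceSketch {

  // Tables that currently have a table-level procedure (e.g. a modify) running.
  private final Set<String> tablesWithActiveProcedure = ConcurrentHashMap.newKeySet();

  void tableProcedureStarted(String table) {
    tablesWithActiveProcedure.add(table);
  }

  void tableProcedureFinished(String table) {
    tablesWithActiveProcedure.remove(table);
  }

  /**
   * Called at the very beginning of a split or merge. Failing fast here is cheap,
   * because no region state has been changed yet.
   */
  void checkCanSplitOrMerge(String table) {
    if (tablesWithActiveProcedure.contains(table)) {
      throw new IllegalStateException(
          "Table procedure in progress on " + table + ", aborting split/merge");
    }
  }
}
```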
[jira] [Updated] (HBASE-21401) Sanity check in BaseDecoder#parseCell
[ https://issues.apache.org/jira/browse/HBASE-21401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Hu updated HBASE-21401:
----------------------------
    Attachment: HBASE-21401.v2.patch

> Sanity check in BaseDecoder#parseCell
> -------------------------------------
>
> Key: HBASE-21401
> URL: https://issues.apache.org/jira/browse/HBASE-21401
> Project: HBase
> Issue Type: Sub-task
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.0.3, 2.1.2
> Attachments: HBASE-21401.v1.patch, HBASE-21401.v2.patch
>
> In KeyValueDecoder & ByteBuffKeyValueDecoder, we pass a byte buffer to initialize the Cell without a sanity check (that is, checking whether each field's offset and length exceed the byte buffer), so an ArrayIndexOutOfBoundsException may happen when reading the cell's fields, as in HBASE-213. This kind of bug is hard to debug.
> An earlier check will help to find such bugs.
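As a rough picture of what an early sanity check buys, the sketch below validates a serialized cell before any field is read, so a corrupt length fails immediately with a descriptive message instead of a later ArrayIndexOutOfBoundsException. It is not the patch attached to this issue: `CellBoundsCheckSketch` and `checkBounds` are hypothetical names, and the layout is deliberately simplified to a 4-byte key length, a 4-byte value length, then the key and value bytes, with no tags or per-field checks.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of validating a serialized cell before decoding it. */
final class CellBoundsCheckSketch {

  private static final int HEADER_SIZE = 8; // int keyLen + int valueLen (simplified layout)

  /**
   * Verifies that the cell at [offset, offset + length) lies inside the buffer and
   * that its declared key and value lengths add up to exactly the cell length.
   */
  static void checkBounds(ByteBuffer buf, int offset, int length) {
    if (offset < 0 || length < HEADER_SIZE || offset > buf.limit() - length) {
      throw new IllegalArgumentException(
          "Cell range out of buffer: offset=" + offset + ", length=" + length
              + ", limit=" + buf.limit());
    }
    // Absolute reads: they do not move the buffer's position.
    int keyLen = buf.getInt(offset);
    int valueLen = buf.getInt(offset + 4);
    if (keyLen < 0 || valueLen < 0) {
      throw new IllegalArgumentException(
          "Negative field length: keyLen=" + keyLen + ", valueLen=" + valueLen);
    }
    // Use long arithmetic so corrupt lengths cannot overflow into a valid-looking sum.
    long expected = (long) HEADER_SIZE + keyLen + valueLen;
    if (expected != length) {
      throw new IllegalArgumentException(
          "Declared lengths do not match cell length: expected=" + expected
              + ", actual=" + length);
    }
  }
}
```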