[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168631#comment-17168631 ]

Hudson commented on HBASE-20226:

Results for branch branch-1 [build #1335 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1335/]: (x) *{color:red}-1 overall{color}*
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1335//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1335//JDK7_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1335//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 source release artifact{color}
-- See build output for details.

> Performance Improvement Taking Large Snapshots In Remote Filesystems
>
> Key: HBASE-20226
> URL: https://issues.apache.org/jira/browse/HBASE-20226
> Project: HBase
> Issue Type: Improvement
> Components: snapshots
> Affects Versions: 3.0.0-alpha-1, 2.3.0, 1.7.0
> Environment: HBase 1.4.0 running on an AWS EMR cluster with the
> hbase.rootdir set to point to a folder in S3
> Reporter: Saad Mufti
> Assignee: Bharath Vissapragada
> Priority: Minor
> Labels: perfomance
> Fix For: 3.0.0-alpha-1, 2.3.1, 1.7.0, 2.4.0, 2.2.7
>
> Attachments: HBASE-20226..01.patch
>
>
> When taking a snapshot of any table, one of the last steps is to delete the
> region manifests, which have already been rolled up into a larger overall
> manifest and thus have redundant information.
> This proposal is to do the deletion in a thread pool bounded by
> hbase.snapshot.thread.pool.max . For large tables with a lot of regions, the
> current single threaded deletion is taking longer than all the rest of the
> snapshot tasks when the Hbase data and the snapshot folder are both in a
> remote filesystem like S3.
> I have a patch for this proposal almost ready and will submit it tomorrow for
> feedback, although I haven't had a chance to write any tests yet.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
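The proposal quoted above (run the region manifest deletes on a bounded thread pool instead of one at a time) can be sketched roughly as follows. This is a minimal illustration and not the attached patch: `ParallelManifestDelete`, `deleteAll`, and the `deleteFn` callback are hypothetical names, and the callback stands in for whatever filesystem delete call the real code would make. `maxThreads` plays the role of the value read from hbase.snapshot.thread.pool.max.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical helper, not part of HBase.
class ParallelManifestDelete {

  /**
   * Runs deleteFn for every manifest on a pool of at most maxThreads workers
   * and returns how many deletes reported success.
   */
  static <T> int deleteAll(List<T> manifests, int maxThreads, Predicate<T> deleteFn) {
    if (manifests.isEmpty()) {
      return 0;
    }
    // Never start more workers than there are manifests to delete.
    int threads = Math.max(1, Math.min(maxThreads, manifests.size()));
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Boolean>> pending = new ArrayList<>();
      for (T manifest : manifests) {
        Callable<Boolean> task = () -> deleteFn.test(manifest);
        pending.add(pool.submit(task));
      }
      int deleted = 0;
      for (Future<Boolean> result : pending) {
        if (result.get()) {
          deleted++;
        }
      }
      return deleted;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while deleting manifests", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("manifest delete failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

Waiting on every `Future` before returning keeps the snapshot step synchronous from the caller's point of view, so a failed delete still surfaces as an exception rather than being lost on a worker thread.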
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168283#comment-17168283 ]

Hudson commented on HBASE-20226:

Results for branch branch-2.2 [build #922 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/922/]: (/) *{color:green}+1 overall{color}*
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/922//General_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/922//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/922//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167991#comment-17167991 ]

Hudson commented on HBASE-20226:

Results for branch master [build #1798 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/1798/]: (x) *{color:red}-1 overall{color}*
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/1798/General_20Nightly_20Build_20Report/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/1798/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://builds.apache.org/job/HBase%20Nightly/job/master/1798/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(x) {color:red}-1 client integration test{color}
-- Failed when running client tests on top of Hadoop 2. [see log for details|https://builds.apache.org/job/HBase%20Nightly/job/master/1798//artifact/output-integration/hadoop-2.log]. (note that this means we didn't run on Hadoop 3)
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167978#comment-17167978 ]

Hudson commented on HBASE-20226:

Results for branch branch-2.3 [build #195 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/195/]: (/) *{color:green}+1 overall{color}*
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/195/General_20Nightly_20Build_20Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/195/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/195/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/195/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167634#comment-17167634 ]

Hudson commented on HBASE-20226:

Results for branch branch-2 [build #2762 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2762/]: (x) *{color:red}-1 overall{color}*
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2762/General_20Nightly_20Build_20Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2756/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2742/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2757/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166636#comment-17166636 ]

Bharath Vissapragada commented on HBASE-20226:

[~zyork] Thanks for the quick clarification. We use a "fast" FS (non-S3), but the deletes are still choked for some HDFS-side reason, and we clearly aren't saturating the handlers or the CPU on the NameNode side. That is very visible because these deletes run sequentially, so the attempt here is to parallelize them. I put up the patch [here|https://github.com/apache/hbase/pull/2159]; do you have a few cycles to take a look?
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166618#comment-17166618 ]

Zach York commented on HBASE-20226:

[~bharathv] This improvement can still be valid if you are using a slow filesystem. The linked JIRA solves this issue by allowing the user to move to a faster filesystem for temporary storage, but for people who don't want to or can't do that, this improvement may still make sense.
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166548#comment-17166548 ]

Bharath Vissapragada commented on HBASE-20226:

Thanks for taking a look, [~yuzhih...@gmail.com]. I've [cleaned up|https://github.com/apache/hbase/pull/2159] the patch a bit.
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166082#comment-17166082 ]

Ted Yu commented on HBASE-20226:

{code}
+if (v1Regions.size() > 0 || v2Regions.size() > 0) {
{code}
It seems the thread pool is needed when v1Regions.size()+v2Regions.size() > 1.

There are also a few findbugs warnings to be addressed.
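The tightened condition suggested above can be sketched as a small null-safe check. This is only an illustration, assuming `v1Regions` and `v2Regions` are lists that may be null (which is what the FindBugs warnings elsewhere in this thread suggest); `ManifestPoolCheck`, `sizeOrZero`, and `needsThreadPool` are hypothetical names and not part of the patch.

```java
import java.util.List;

// Hypothetical helper, not part of HBase.
class ManifestPoolCheck {

  /** Treats a null list as empty so the caller never dereferences null. */
  static int sizeOrZero(List<?> regions) {
    return regions == null ? 0 : regions.size();
  }

  /** A pool only pays off when more than one manifest has to be deleted. */
  static boolean needsThreadPool(List<?> v1Regions, List<?> v2Regions) {
    return sizeOrZero(v1Regions) + sizeOrZero(v2Regions) > 1;
  }
}
```

With zero or one manifest in total, the caller could delete inline and skip creating an executor altogether.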
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166068#comment-17166068 ]

Bharath Vissapragada commented on HBASE-20226:

bq. This is related to the work going on in HBASE-21098. It's likely that this Jira is no longer needed after that work is concluded.

[~zyork] Why is this no longer needed? I think the attached patch parallelizes the deletes by adding them to a thread pool, while the parent jira (HBASE-21098) doesn't do that, right? We are running into bottlenecks on this sequential delete, and that is how I found this jira. Did I miss something?
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605110#comment-16605110 ]

Hadoop QA commented on HBASE-20226:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange} 0m 0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 22s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 15s{color} | {color:red} hbase-server: The patch generated 13 new + 5 unchanged - 0 fixed = 18 total (was 5) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 18s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 41s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 12s{color} | {color:red} hbase-server generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 51s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 64m 24s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | v2Regions could be null and is guaranteed to be dereferenced in org.apache.hadoop.hbase.snapshot.SnapshotManifest.convertToV2SingleManifest() Dereferenced at SnapshotManifest.java:is guaranteed to be dereferenced in org.apache.hadoop.hbase.snapshot.SnapshotManifest.convertToV2SingleManifest() Dereferenced at SnapshotManifest.java:[line 516] |
| | Possible null pointer dereference of v1Regions in org.apache.hadoop.hbase.snapshot.SnapshotManifest.convertToV2SingleManifest() Dereferenced at SnapshotManifest.java:v1Regions in org.apache.hadoop.hbase.snapshot.SnapshotManifest.convertToV2SingleManifest() Dereferenced at SnapshotManifest.java:[line 516] |
| | Nullcheck of v1Regions at line 516 of value previously dereferenced in org.apache.hadoop.hbase.snapshot.SnapshotManifest.convertToV2SingleManifest() At SnapshotManifest.java:516 of value previously dereferenced in
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16605053#comment-16605053 ]

Zach York commented on HBASE-20226:

This is related to the work going on in HBASE-21098. It's likely that this Jira is no longer needed after that work is concluded.
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411165#comment-16411165 ]

Steve Loughran commented on HBASE-20226:

Amazon throttles DELETE requests to the same shard, so the speedup will be sublinear, even though the cost of a delete/bulk delete is low in terms of network traffic. If you are doing bulk deletes in more than one thread, it's probably best to shuffle the list of directories to delete before queuing the operations.
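The shuffling advice above can be sketched as follows. This is a minimal illustration, assuming the directories are held as plain strings; `ShuffledDeleteOrder` and `shuffled` are hypothetical names. The point is only that neighbouring paths, which are likely to share a key prefix, no longer get queued back to back.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of HBase or the S3 connector.
class ShuffledDeleteOrder {

  /**
   * Returns a shuffled copy of the directories to delete, leaving the input
   * list untouched so callers can still log or retry in the original order.
   */
  static List<String> shuffled(List<String> dirs, Random rng) {
    List<String> copy = new ArrayList<>(dirs);
    Collections.shuffle(copy, rng);
    return copy;
  }
}
```

Passing the `Random` in explicitly makes the order reproducible in tests; production code could pass a fresh `new Random()`.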
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406488#comment-16406488 ]

Ted Yu commented on HBASE-20226:

{code}
+if (v1Regions.size() > 0 || v2Regions.size() > 0) {
{code}
I think you may tighten the above condition by checking the sum of the sizes.
{code}
+ ThreadPoolExecutor tpoolDelete = createExecutor("SnapshotRegionManifestDeletePool");
{code}
where:
{code}
public static ThreadPoolExecutor createExecutor(final Configuration conf, final String name) {
  int maxThreads = conf.getInt("hbase.snapshot.thread.pool.max", 8);
{code}
You can add a new config instead of depending on the existing config above.
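One way to read the last suggestion is a dedicated setting for the delete pool that falls back to the shared one. The sketch below assumes exactly that and uses `java.util.Properties` as a stand-in for the Hadoop `Configuration` object so that it runs without HBase on the classpath. The key `hbase.snapshot.manifest.delete.threads` is invented for illustration and is not a real HBase property; `DeletePoolSizing` and `deleteThreads` are likewise hypothetical names.

```java
import java.util.Properties;

// Hypothetical helper, not part of HBase.
class DeletePoolSizing {
  // Existing key and default, as quoted in the comment above.
  static final String SHARED_KEY = "hbase.snapshot.thread.pool.max";
  static final int SHARED_DEFAULT = 8;
  // Invented key for illustration only; not a real HBase property.
  static final String DELETE_KEY = "hbase.snapshot.manifest.delete.threads";

  /** Prefers the dedicated key, then the shared key, then the default. */
  static int deleteThreads(Properties conf) {
    String shared = conf.getProperty(SHARED_KEY, Integer.toString(SHARED_DEFAULT));
    return Integer.parseInt(conf.getProperty(DELETE_KEY, shared).trim());
  }
}
```

Falling back to the shared key keeps existing deployments unchanged while letting operators tune the delete pool separately.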
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406469#comment-16406469 ]

Ted Yu commented on HBASE-20226:

The following is reproducible:
{code}
testReadSnapshotManifest(org.apache.hadoop.hbase.snapshot.TestSnapshotManifest)  Time elapsed: 0.064 sec  <<< ERROR!
java.lang.NullPointerException
	at org.apache.hadoop.hbase.snapshot.TestSnapshotManifest.setup(TestSnapshotManifest.java:83)
{code}
See if fixing the FindBugs warnings would avoid the above exception.
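The FindBugs warnings in question (from the Hadoop QA report on this patch) flag possible null dereferences of v1Regions and v2Regions in SnapshotManifest.convertToV2SingleManifest(). A common way to address that kind of warning is to normalize a possibly-null list to an empty one before calling size(). The sketch below shows that pattern together with the "sum of the sizes" check suggested in review. The class and method names are hypothetical, and this is not the fix that was actually committed.

```java
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; not the actual SnapshotManifest code. */
public class NullSafeRegionsSketch {

  /** Returns the list itself, or an immutable empty list when it is null. */
  public static <T> List<T> orEmpty(List<T> maybeNull) {
    return maybeNull == null ? Collections.<T>emptyList() : maybeNull;
  }

  /** Total number of region manifests across both formats, treating null as empty. */
  public static int totalManifests(List<?> v1Regions, List<?> v2Regions) {
    return orEmpty(v1Regions).size() + orEmpty(v2Regions).size();
  }

  /** True when there is at least one region manifest to clean up. */
  public static boolean hasManifestsToDelete(List<?> v1Regions, List<?> v2Regions) {
    return totalManifests(v1Regions, v2Regions) > 0;
  }
}
```

Whether this would also clear the NullPointerException in TestSnapshotManifest.setup is not established by the thread; the stack trace points at the test's own setup method.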
[jira] [Commented] (HBASE-20226) Performance Improvement Taking Large Snapshots In Remote Filesystems
[ https://issues.apache.org/jira/browse/HBASE-20226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16406238#comment-16406238 ]

Hadoop QA commented on HBASE-20226:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 4m 7s | master passed |
| +1 | compile | 1m 28s | master passed |
| +1 | checkstyle | 0m 58s | master passed |
| +1 | shadedjars | 5m 4s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 1m 43s | master passed |
| +1 | javadoc | 0m 26s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 4m 7s | the patch passed |
| +1 | compile | 1m 30s | the patch passed |
| +1 | javac | 1m 30s | the patch passed |
| -1 | checkstyle | 0m 58s | hbase-server: The patch generated 13 new + 5 unchanged - 0 fixed = 18 total (was 5) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 5s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 16m 41s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| -1 | findbugs | 2m 3s | hbase-server generated 3 new + 0 unchanged - 0 fixed = 3 total (was 0) |
| +1 | javadoc | 0m 27s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 19m 58s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 9s | The patch does not generate ASF License warnings. |
| | | 59m 16s | |

|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | v2Regions could be null and is guaranteed to be dereferenced in org.apache.hadoop.hbase.snapshot.SnapshotManifest.convertToV2SingleManifest() Dereferenced at SnapshotManifest.java:[line 549] |
| | Possible null pointer dereference of v1Regions in org.apache.hadoop.hbase.snapshot.SnapshotManifest.convertToV2SingleManifest() Dereferenced at SnapshotManifest.java:[line 516] |
| | Nullcheck of v1Regions at line 516 of value previously dereferenced in org.apache.hadoop.hbase.snapshot.SnapshotManifest.convertToV2SingleManifest() At SnapshotManifest.java:516 of value previously dereferenced in