[jira] [Commented] (HBASE-28082) oldWALs naming can be incompatible with HBase backup
[ https://issues.apache.org/jira/browse/HBASE-28082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832931#comment-17832931 ]

Bryan Beaudreault commented on HBASE-28082:
-------------------------------------------

I lost track of this, but just ran into the issue in our environment and was pleased to find it solved. Thanks [~janvanbesien]!

> oldWALs naming can be incompatible with HBase backup
> ----------------------------------------------------
>
>                 Key: HBASE-28082
>                 URL: https://issues.apache.org/jira/browse/HBASE-28082
>             Project: HBase
>          Issue Type: Bug
>          Components: backup&restore
>         Environment: Encountered on HBase a2e7d2015e9f603e46339d0582e29a86843b9324 (branch-2), running in Kubernetes.
>            Reporter: Dieter De Paepe
>            Assignee: Jan Van Besien
>            Priority: Major
>             Fix For: 2.6.0, 3.0.0-beta-1
>
>
> I am testing HBase backup functionality, and noticed the following warning when
> running "hbase backup create incremental ...":
>
> {noformat}
> 23/09/13 15:44:10 WARN org.apache.hadoop.hbase.backup.util.BackupUtils: Skip log file (can't parse):
> hdfs://hdfsns/hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694609969312{noformat}
> It appears that in my setup, the oldWALs are indeed given names that seem to break
> "ServerName.valueOf(s)" in "BackupUtils#parseHostFromOldLog(Path p)":
>
> {noformat}
> user@hadoop-client-769bc9946-xqrt2:/$ hdfs dfs -ls hdfs:///hbase/hbase/oldWALs
> Found 42 items
> -rw-r--r-- 1 hbase hbase 775421 2023-09-13 13:14
> hdfs:///hbase/hbase/oldWALs/hbase-master-0.minikube-shared%2C16000%2C1694609954719.hbase-master-0.minikube-shared%2C16000%2C1694609954719.regiongroup-0.1694609957984$masterlocalwal$
> -rw-r--r-- 1 hbase hbase 26059 2023-09-13 13:29
> hdfs:///hbase/hbase/oldWALs/hbase-master-0.minikube-shared%2C16000%2C1694609954719.hbase-master-0.minikube-shared%2C16000%2C1694609954719.regiongroup-0.1694610867894$masterlocalwal$
> ...
> -rw-r--r-- 1 hbase hbase 242479 2023-09-13 14:16
> hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694609969312
> -rw-r--r-- 1 hbase hbase 4364 2023-09-13 14:16
> hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694610188654
> ...
> -rw-r--r-- 1 hbase hbase 70802 2023-09-13 13:15
> hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.meta.1694609970025.meta
> -rw-r--r-- 1 hbase hbase 93 2023-09-13 13:04
> hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.meta.1694610188627.meta
> ...{noformat}
> I'd say this is not a bug in the backup system, but rather in whatever gives
> the oldWAL files their names. However, I'm not familiar enough with the HBase code to
> find where these files are created. Any pointers are appreciated.
> Given that this causes some logs to be missed during backup, I guess this can
> lead to data loss in a backup restore?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Commented] (HBASE-28082) oldWALs naming can be incompatible with HBase backup
[ https://issues.apache.org/jira/browse/HBASE-28082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772928#comment-17772928 ]

Hudson commented on HBASE-28082:
--------------------------------

Results for branch branch-2
[build #900 on builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/900/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/900/General_20Nightly_20Build_20Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/900/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/900/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2/900/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-28082) oldWALs naming can be incompatible with HBase backup
[ https://issues.apache.org/jira/browse/HBASE-28082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772863#comment-17772863 ]

Hudson commented on HBASE-28082:
--------------------------------

Results for branch master
[build #922 on builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/922/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/922/General_20Nightly_20Build_20Report/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/922/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/master/922/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-28082) oldWALs naming can be incompatible with HBase backup
[ https://issues.apache.org/jira/browse/HBASE-28082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17772857#comment-17772857 ]

Hudson commented on HBASE-28082:
--------------------------------

Results for branch branch-3
[build #62 on builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/62/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/62/General_20Nightly_20Build_20Report/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/62/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-3/62/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}
[jira] [Commented] (HBASE-28082) oldWALs naming can be incompatible with HBase backup
[ https://issues.apache.org/jira/browse/HBASE-28082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771045#comment-17771045 ]

Jan Van Besien commented on HBASE-28082:
----------------------------------------

Okay, here it is: https://github.com/apache/hbase/pull/5445
[jira] [Commented] (HBASE-28082) oldWALs naming can be incompatible with HBase backup
[ https://issues.apache.org/jira/browse/HBASE-28082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771044#comment-17771044 ]

Bryan Beaudreault commented on HBASE-28082:
-------------------------------------------

The preferred way to contribute these days is by submitting a PR at https://github.com/apache/hbase
[jira] [Commented] (HBASE-28082) oldWALs naming can be incompatible with HBase backup
[ https://issues.apache.org/jira/browse/HBASE-28082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771019#comment-17771019 ]

Jan Van Besien commented on HBASE-28082:
----------------------------------------

I have a patch that works by making fewer assumptions about the actual file name, other than that it starts with a ServerName (host,port,...). I can't seem to attach patch files, though.

{code:java}
From 40d88d9253c78e04823af49f199684bd8ac03966 Mon Sep 17 00:00:00 2001
From: Jan Van Besien
Date: Mon, 2 Oct 2023 11:07:59 +0200
Subject: [PATCH] HBASE-28082 more lenient WAL hostname parsing

Make the hostname parsing in BackupUtils#parseHostFromOldLog more lenient by
not making any assumptions about the name of the file other than that it starts
with a org.apache.hadoop.hbase.ServerName.
---
 .../hadoop/hbase/backup/util/BackupUtils.java | 10 +
 .../hadoop/hbase/backup/TestBackupUtils.java  | 22 +++
 2 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/util/BackupUtils.java b/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/util/BackupUtils.java
index 5be8eed3952..a920b55bca9 100644
--- a/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/util/BackupUtils.java
+++ b/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/util/BackupUtils.java
@@ -30,6 +30,7 @@ import java.util.Map;
 import java.util.Map.Entry;
 import java.util.TreeMap;
 import java.util.TreeSet;
+import com.google.common.collect.Iterables;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.fs.FSDataOutputStream;
 import org.apache.hadoop.fs.FileStatus;
@@ -365,10 +366,11 @@ public final class BackupUtils {
       return null;
     }
     try {
-      String n = p.getName();
-      int idx = n.lastIndexOf(LOGNAME_SEPARATOR);
-      String s = URLDecoder.decode(n.substring(0, idx), "UTF8");
-      return ServerName.valueOf(s).getAddress().toString();
+      String urlDecodedName = URLDecoder.decode(p.getName(), "UTF8");
+      Iterable<String> nameSplitsOnComma = Splitter.on(",").split(urlDecodedName);
+      String host = Iterables.get(nameSplitsOnComma, 0);
+      String port = Iterables.get(nameSplitsOnComma, 1);
+      return host + ":" + port;
     } catch (Exception e) {
       LOG.warn("Skip log file (can't parse): {}", p);
       return null;
diff --git a/hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestBackupUtils.java b/hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestBackupUtils.java
index 831ec309cfc..6aebe8db082 100644
--- a/hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestBackupUtils.java
+++ b/hbase-backup/src/test/java/org/apache/hadoop/hbase/backup/TestBackupUtils.java
@@ -87,21 +87,25 @@ public class TestBackupUtils {
   @Test
   public void testFilesystemWalHostNameParsing() throws IOException {
-    String host = "localhost";
+    String host = "a-region-server.domain.com";
     int port = 60030;
     ServerName serverName = ServerName.valueOf(host, port, 1234);
     Path walRootDir = CommonFSUtils.getWALRootDir(conf);
     Path oldLogDir = new Path(walRootDir, HConstants.HREGION_OLDLOGDIR_NAME);
-    Path testWalPath = new Path(oldLogDir,
-      serverName.toString() + BackupUtils.LOGNAME_SEPARATOR + EnvironmentEdgeManager.currentTime());
-    Path testMasterWalPath =
-      new Path(oldLogDir, testWalPath.getName() + MasterRegionFactory.ARCHIVED_WAL_SUFFIX);
+    Path testOldWalPath = new Path(oldLogDir,
+      serverName + BackupUtils.LOGNAME_SEPARATOR + EnvironmentEdgeManager.currentTime());
+    Assert.assertEquals(host + Addressing.HOSTNAME_PORT_SEPARATOR + port, BackupUtils.parseHostFromOldLog(testOldWalPath));

-    String parsedHost = BackupUtils.parseHostFromOldLog(testMasterWalPath);
-    Assert.assertNull(parsedHost);
+    Path testMasterWalPath =
+      new Path(oldLogDir, testOldWalPath.getName() + MasterRegionFactory.ARCHIVED_WAL_SUFFIX);
+    Assert.assertNull(BackupUtils.parseHostFromOldLog(testMasterWalPath));

-    parsedHost = BackupUtils.parseHostFromOldLog(testWalPath);
-    Assert.assertEquals(parsedHost, host + Addressing.HOSTNAME_PORT_SEPARATOR + port);
+    // org.apache.hadoop.hbase.wal.BoundedGroupingStrategy does this
+    Path testOldWalWithRegionGroupingPath = new Path(oldLogDir,
+      serverName + BackupUtils.LOGNAME_SEPARATOR + serverName +
+        BackupUtils.LOGNAME_SEPARATOR + "regiongroup-0" + BackupUtils.LOGNAME_SEPARATOR +
+        EnvironmentEdgeManager.currentTime());
+    Assert.assertEquals(host + Addressing.HOSTNAME_PORT_SEPARATOR + port, BackupUtils.parseHostFromOldLog(testOldWalWithRegionGroupingPath));
   }
 }
--
2.41.0
{code}
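For readers who want to try the lenient approach from the patch outside an HBase checkout, here is a minimal standalone sketch. It is not the HBase code: the class name `LenientWalHostParser`, the method `parseHostPort`, and the host `rs1.example.com` are hypothetical, plain `String.split` stands in for the Guava `Splitter`/`Iterables` calls, and the `$masterlocalwal$` suffix check is an assumption based on the oldWALs listing in the issue description and on the test's expectation that master WALs yield null.

```java
import java.net.URLDecoder;

public class LenientWalHostParser {
  // Assumption: suffix of archived master-local WALs, as seen in the oldWALs listing.
  static final String MASTER_WAL_SUFFIX = "$masterlocalwal$";

  /** Returns "host:port" parsed from an oldWALs file name, or null if it cannot be parsed. */
  static String parseHostPort(String fileName) {
    if (fileName == null || fileName.endsWith(MASTER_WAL_SUFFIX)) {
      return null;
    }
    try {
      // Decode the whole name first, so "%2C" becomes ",".
      String decoded = URLDecoder.decode(fileName, "UTF-8");
      // Only assume the name starts with "host,port,...": take the first two fields.
      String[] fields = decoded.split(",");
      if (fields.length < 2) {
        return null;
      }
      return fields[0] + ":" + fields[1];
    } catch (Exception e) {
      // Malformed escapes or an unsupported encoding: treat the name as unparseable.
      return null;
    }
  }

  public static void main(String[] args) {
    String rs = "rs1.example.com%2C16020%2C1694609964681";
    System.out.println(parseHostPort(rs + ".1694609969312"));
    System.out.println(parseHostPort(rs + "." + rs + ".regiongroup-0.1694609969312"));
    System.out.println(parseHostPort(rs + ".meta.1694609970025.meta"));
    System.out.println(parseHostPort(rs + "." + rs + ".regiongroup-0.1694609957984$masterlocalwal$"));
  }
}
```

Because only the first two comma-separated fields are read, whatever follows the leading ServerName (a repeated server name, a `regiongroup-N` part, a `.meta` suffix) no longer affects the result.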
[jira] [Commented] (HBASE-28082) oldWALs naming can be incompatible with HBase backup
[ https://issues.apache.org/jira/browse/HBASE-28082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17764782#comment-17764782 ]

Bryan Beaudreault commented on HBASE-28082:
-------------------------------------------

I think the problem is related to your usage of multiwal for hbase.wal.provider. It seems that this feature adds a ".regiongroup-#" suffix to the WAL path name. I do think the bug is in the backup system: we should fix BackupUtils#parseHostFromOldLog to ignore that suffix before trying to extract a ServerName.
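To make the diagnosis above concrete, the following standalone sketch mimics the parsing approach that was in place when the issue was reported: cut the file name at the last separator, URL-decode the prefix, then parse it strictly as a server name. It is an illustration under assumptions, not the HBase code: `strictHostPort` is a hypothetical stand-in for `ServerName.valueOf(s)` that accepts only `host,port,startcode`, `LOGNAME_SEPARATOR` is assumed to be ".", and the host `rs1.example.com` is made up.

```java
import java.net.URLDecoder;

public class OldWalNameDemo {
  // Assumption: stands in for BackupUtils.LOGNAME_SEPARATOR.
  static final String LOGNAME_SEPARATOR = ".";

  /** Hypothetical strict parser: accepts only "host,port,startcode". */
  static String strictHostPort(String serverName) {
    String[] parts = serverName.split(",");
    if (parts.length != 3) {
      throw new IllegalArgumentException("not host,port,startcode: " + serverName);
    }
    Integer.parseInt(parts[1]); // port must be numeric
    Long.parseLong(parts[2]);   // startcode must be numeric
    return parts[0] + ":" + parts[1];
  }

  /** Cut at the last separator, decode, parse strictly; null when parsing fails. */
  static String parseOldStyle(String fileName) {
    try {
      int idx = fileName.lastIndexOf(LOGNAME_SEPARATOR);
      String s = URLDecoder.decode(fileName.substring(0, idx), "UTF-8");
      return strictHostPort(s);
    } catch (Exception e) {
      return null;
    }
  }

  public static void main(String[] args) {
    String rs = "rs1.example.com%2C16020%2C1694609964681";
    // Default naming: <servername>.<timestamp>
    System.out.println(parseOldStyle(rs + ".1694609969312"));
    // multiwal naming from the issue: <servername>.<servername>.regiongroup-N.<timestamp>
    System.out.println(parseOldStyle(rs + "." + rs + ".regiongroup-0.1694609969312"));
  }
}
```

With the multiwal-style name, the prefix left after removing only the trailing timestamp still contains the repeated server name and the `regiongroup-0` part, so the strict parse fails and the file is skipped.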