[jira] [Created] (HBASE-28903) Incremental backup test missing explicit test for bulkloads
Hernan Gelaf-Romer created HBASE-28903:
Summary: Incremental backup test missing explicit test for bulkloads
Key: HBASE-28903
URL: https://issues.apache.org/jira/browse/HBASE-28903
Project: HBase
Issue Type: Improvement
Reporter: Hernan Gelaf-Romer

Our incremental backup tests don't explicitly test our ability to back up and restore bulkloads. It'd be nice to have an explicit test to verify that bulkloads work in the context of the backup/restore flow, and to avoid regressions in the future.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HBASE-28897) Incremental backups can be taken with incompatible column families
Hernan Gelaf-Romer created HBASE-28897:
Summary: Incremental backups can be taken with incompatible column families
Key: HBASE-28897
URL: https://issues.apache.org/jira/browse/HBASE-28897
Project: HBase
Issue Type: Improvement
Components: backup&restore
Reporter: Hernan Gelaf-Romer

Incremental backups can be taken even if the column families in the current table descriptor do not match the column families of the full backup for that same table. When restoring the table, we choose to use the families of the full backup. This can cause the restore process to fail if the incremental backup contains a column family that doesn't exist in the full backup: the bulkload step fails because it tries to write column families that don't exist in the restore table.

I think the correct solution here is to prevent incremental backups from being taken if the families of the current table don't match those of the full backup. This will force users to take a full backup instead.
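The proposed guard could look roughly like the sketch below. It is only an illustration under stated assumptions: the class and method names are hypothetical and not part of the HBase backup API, and column families are represented as plain strings rather than table descriptors.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of the HBase backup API.
final class FamilyCompatibilitySketch {

    // Returns the families that appear in exactly one of the two sets.
    static Set<String> mismatchedFamilies(Set<String> fullBackupFamilies,
                                          Set<String> currentFamilies) {
        Set<String> mismatched = new TreeSet<>(currentFamilies);
        mismatched.removeAll(fullBackupFamilies);      // added since the full backup
        Set<String> onlyInBackup = new TreeSet<>(fullBackupFamilies);
        onlyInBackup.removeAll(currentFamilies);       // dropped since the full backup
        mismatched.addAll(onlyInBackup);
        return mismatched;
    }

    // Rejects the incremental backup when the family sets differ.
    static void requireCompatible(Set<String> fullBackupFamilies,
                                  Set<String> currentFamilies) {
        Set<String> mismatched = mismatchedFamilies(fullBackupFamilies, currentFamilies);
        if (!mismatched.isEmpty()) {
            throw new IllegalStateException(
                "Column families differ from the full backup: " + mismatched
                    + "; take a full backup instead");
        }
    }
}
```

Calling `requireCompatible` before starting an incremental backup would fail fast at backup time, rather than at restore time when the bulkload hits the missing family.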
[jira] [Created] (HBASE-28484) Allow replicating to different table
Hernan Gelaf-Romer created HBASE-28484:
Summary: Allow replicating to different table
Key: HBASE-28484
URL: https://issues.apache.org/jira/browse/HBASE-28484
Project: HBase
Issue Type: Improvement
Components: Replication
Reporter: Hernan Gelaf-Romer
Assignee: Hernan Gelaf-Romer

At the moment, HBase replication assumes the source and target table have the same name. Our company is looking into a use case that would require replicating to an arbitrary target table.

*Potential Solution:*
It seems that the WALKey stores the name of the table that its corresponding edit should be applied to. We can add a setting to ReplicationPeerConfig that allows users to specify a source to target table mapping. This mapping can then be used to overwrite the WALKey's table before shipping the edit to the target cluster.
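A minimal sketch of the mapping lookup, assuming table names are handled as plain strings. `TableMappingSketch` and `targetFor` are hypothetical names for illustration; the real change would live in ReplicationPeerConfig and the code that ships WAL entries, neither of which is modeled here.

```java
import java.util.Map;

// Hypothetical illustration of a source-to-target table mapping.
final class TableMappingSketch {
    private final Map<String, String> sourceToTarget;

    TableMappingSketch(Map<String, String> sourceToTarget) {
        // Defensive immutable copy so the mapping cannot change mid-replication.
        this.sourceToTarget = Map.copyOf(sourceToTarget);
    }

    // Tables without an explicit mapping keep their name, which matches
    // the current behavior of replicating to a same-named table.
    String targetFor(String sourceTable) {
        return sourceToTarget.getOrDefault(sourceTable, sourceTable);
    }
}
```

Falling back to the source name for unmapped tables keeps existing peers working unchanged when no mapping is configured.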
[jira] [Created] (HBASE-27641) Verify replication excessive false positive bad rows
Hernan Gelaf-Romer created HBASE-27641:
Summary: Verify replication excessive false positive bad rows
Key: HBASE-27641
URL: https://issues.apache.org/jira/browse/HBASE-27641
Project: HBase
Issue Type: Improvement
Components: mapreduce, Replication
Reporter: Hernan Gelaf-Romer
Assignee: Hernan Gelaf-Romer

Verify replication can generate a lot of `BADROWS` results when comparing a row that may be particularly hot at the time of re-compare. Replication lag can then cause a mismatch between the source and sink results even though the row will eventually be consistent.

We could add some configurable re-compare mechanism that will make verify replication less susceptible to falsely reporting `BADROWS` when under significant write load. These re-compares can be done asynchronously so as to not significantly slow down the execution time of the job.
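The configurable re-compare could be sketched as a bounded retry loop, as below. This is an illustration only: the names are hypothetical, row contents are reduced to byte arrays supplied by the caller, and the loop runs synchronously. To get the asynchronous behavior described above, each call could be submitted to an executor instead of being run inline.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical illustration of a bounded re-compare loop.
final class RecompareSketch {

    // Re-reads both sides up to maxRecompares extra times before reporting a mismatch.
    static boolean matchesWithRecompare(Supplier<byte[]> source, Supplier<byte[]> sink,
                                        int maxRecompares, long sleepMillis) {
        for (int attempt = 0; attempt <= maxRecompares; attempt++) {
            if (Arrays.equals(source.get(), sink.get())) {
                return true;                        // sink has caught up
            }
            if (attempt < maxRecompares) {
                try {
                    Thread.sleep(sleepMillis);      // give replication time to catch up
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;                   // treat interruption as a mismatch
                }
            }
        }
        return false;                               // still different: a genuine bad row
    }
}
```

Only rows that still differ after the last re-compare would be counted as `BADROWS`.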
[jira] [Created] (HBASE-27364) Intra-cluster replication sink metrics
Hernan Gelaf-Romer created HBASE-27364:
Summary: Intra-cluster replication sink metrics
Key: HBASE-27364
URL: https://issues.apache.org/jira/browse/HBASE-27364
Project: HBase
Issue Type: Improvement
Components: metrics, read replicas
Reporter: Hernan Gelaf-Romer

Region replication doesn't emit any sink metrics at the moment. These would be useful in determining replication lag; adding metrics such as ageOfLastAppliedOp would be helpful.
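As a rough illustration of what such a metric measures, the sketch below computes the age of the last applied edit as the current time minus that edit's write time. The class and method names are hypothetical and this is not the HBase metrics API; reporting 0 before any edit has been applied is an assumption made for the example.

```java
// Hypothetical illustration of an "age of last applied op" gauge.
final class SinkAgeSketch {
    // Write time of the most recently applied edit; -1 means nothing applied yet.
    private volatile long lastAppliedWriteTimeMs = -1L;

    // Called by the sink after it applies an edit.
    void onApplied(long writeTimeMs) {
        lastAppliedWriteTimeMs = writeTimeMs;
    }

    // Age in milliseconds relative to nowMs; never negative.
    long ageOfLastAppliedOp(long nowMs) {
        long last = lastAppliedWriteTimeMs;
        if (last < 0) {
            return 0L;  // assumption: report 0 until the first edit is applied
        }
        return Math.max(0L, nowMs - last);
    }
}
```

A steadily growing value would suggest the replica is falling behind the primary.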