[ https://issues.apache.org/jira/browse/HBASE-16423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639905#comment-16639905 ]
Lei Chen edited comment on HBASE-16423 at 10/5/18 2:50 PM:
-----------------------------------------------------------

I'm facing the false-positive inconsistency problem you described here as well.

Having the thread sleep and compare again some time later looks like a good way to reduce noise, but it may not be a guaranteed way to report inconsistency. As long as ingestion is running, it is possible that, by the time of the re-comparison, the target row on the source and on the replica has matched and then diverged again.

A more sophisticated method may be required if the user needs 100% confidence.


> Add re-compare option to VerifyReplication to avoid occasional inconsistent rows
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-16423
>                 URL: https://issues.apache.org/jira/browse/HBASE-16423
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>    Affects Versions: 2.0.0
>            Reporter: Jianwei Cui
>            Assignee: Jianwei Cui
>            Priority: Minor
>             Fix For: 1.4.0, 2.0.0
>
>         Attachments: HBASE-16423-branch-1-v1.patch, HBASE-16423-v1.patch,
> HBASE-16423-v2.patch, HBASE-16423-v3.patch
>
>
> Because replication is only eventually consistent, VerifyReplication may
> report inconsistent rows if data is being written to the source or peer
> cluster during scanning. These transiently inconsistent rows will have the
> same data if we do the comparison again after a short period.
> It is not easy to find the truly inconsistent rows if VerifyReplication
> reports a large number of such transient inconsistencies. To avoid this, we
> can add an option that makes VerifyReplication read the inconsistent rows
> again after sleeping a few seconds and re-compare them during scanning. This
> behavior follows the eventual consistency of HBase's replication.
> Suggestions and discussion are welcome.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
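
For readers following the thread, the sleep-and-re-compare idea from the issue description can be sketched roughly as below. This is only an illustration, not the code in the attached patches: `RecompareSketch`, `RowReader`, `matchesWithRecompare`, `maxRetries`, and `sleepMillis` are hypothetical names, and the in-memory readers stand in for real reads against the source and peer clusters.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

public class RecompareSketch {

    /** Hypothetical stand-in for reading one row's content from a cluster. */
    public interface RowReader {
        byte[] read(String rowKey);
    }

    /**
     * Compares a row on both sides. On a mismatch, sleeps and re-reads both
     * sides, up to maxRetries extra times. Returns true if the row matched on
     * any attempt, false if it never matched (or the thread was interrupted).
     */
    public static boolean matchesWithRecompare(String rowKey, RowReader source,
            RowReader peer, int maxRetries, long sleepMillis) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (attempt > 0) {
                try {
                    // Give replication time to catch up before re-reading.
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Restore the flag and treat the row as still unresolved.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            if (Arrays.equals(source.read(rowKey), peer.read(rowKey))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] fresh = "v2".getBytes(StandardCharsets.UTF_8);
        byte[] stale = "v1".getBytes(StandardCharsets.UTF_8);
        RowReader source = rowKey -> fresh;

        // Simulated lagging peer: stale on the first read, caught up afterwards.
        AtomicInteger reads = new AtomicInteger();
        RowReader laggingPeer = rowKey -> reads.getAndIncrement() == 0 ? stale : fresh;

        // Simulated peer that never receives the edit.
        RowReader divergedPeer = rowKey -> stale;

        System.out.println(matchesWithRecompare("row1", source, laggingPeer, 1, 50L));  // true
        System.out.println(matchesWithRecompare("row1", source, divergedPeer, 1, 50L)); // false
    }
}
```

As the comment above points out, a `false` result here is still not proof of a real inconsistency: under continuous ingestion the two sides can match between attempts and diverge again before the next read, so the retry only reduces noise.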