Hello, I would like to bring PHOENIX-7106 <https://issues.apache.org/jira/browse/PHOENIX-7106> to everyone's attention and give a brief overview of the data integrity issues we have in various coprocessors. The majority of the issues stem from the fact that we do not return a valid rowkey for certain queries. If a region moves in the middle of a scan, the HBase client relies on the last returned rowkey and adjusts the scan boundaries accordingly when the scanner is reset to continue the scan. As long as the region does not move, the scan is not expected to return invalid data. However, if the region moves while the scan is in progress, the scan would return invalid or incorrect data, causing data integrity issues.
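To make the failure mode concrete, here is a minimal toy simulation in Python. It does not use the HBase or Phoenix APIs; `server_scan`, `client_scan`, `move_after`, and the placeholder-key functions are all hypothetical names I made up for illustration. It only assumes what is described above: after a region move, the client reopens the scanner just past the last rowkey it was given.

```python
from bisect import bisect_right

# Toy "region": rows sorted by rowkey, as (rowkey, value) pairs.
ROWS = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]


def server_scan(rows, start_after):
    """Yield rows whose key is strictly greater than start_after (all rows if None)."""
    keys = [key for key, _ in rows]
    begin = 0 if start_after is None else bisect_right(keys, start_after)
    yield from rows[begin:]


def client_scan(rows, returned_key, move_after=None):
    """Scan all rows, optionally simulating one region move after `move_after` results.

    `returned_key` models the coprocessor: it maps the rowkey actually scanned
    to the rowkey handed back to the client (hypothetical hook, not an HBase API).
    """
    values = []
    last_returned = None
    scanner = server_scan(rows, None)
    while True:
        if move_after is not None and len(values) == move_after:
            # Region moved: the client resets the scanner and resumes
            # just past the last rowkey it received.
            scanner = server_scan(rows, last_returned)
            move_after = None
        row = next(scanner, None)
        if row is None:
            break
        row_key, value = row
        last_returned = returned_key(row_key)
        values.append(value)
    return values


def valid_key(row_key):
    return row_key  # coprocessor returns the rowkey it actually scanned


def placeholder_low(row_key):
    return "a"  # assumed bad behaviour: a fixed key behind the scan position


def placeholder_high(row_key):
    return "c"  # assumed bad behaviour: a fixed key ahead of the scan position


print(client_scan(ROWS, valid_key, move_after=2))         # [1, 2, 3, 4]
print(client_scan(ROWS, placeholder_low))                 # [1, 2, 3, 4]
print(client_scan(ROWS, placeholder_low, move_after=2))   # [1, 2, 2, 3, 4]
print(client_scan(ROWS, placeholder_high, move_after=2))  # [1, 2, 4]
```

In this sketch, a wrong rowkey is harmless while the region stays put, but once the scanner is reset it leads to either a duplicated row or a skipped row, depending on where the returned key falls relative to the real scan position.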
Given the critical nature of these issues, I would like to propose that we treat this as a high priority for the upcoming 5.2.0 release and not merge any other feature or big change to the master branch until this is merged. The PR is not ready yet: additional changes are still local to my machine and need to be rebased onto the current master. I will get back to this discussion thread as soon as the PR and the doc are updated with the latest findings. The changes touch many of our coproc scanner implementations, so they will require significant review as well. Holding off on other features and big changes until this gets in would keep merging and rebasing from getting more complicated. Once this is merged to the master branch, I would like to cut the 5.2 branch from master so that we can move forward with the 5.2.0 release. Please let me know if this looks good or if you have any other high priority work for 5.2.0.