Hello,

I would like to bring PHOENIX-7106
<https://issues.apache.org/jira/browse/PHOENIX-7106> to everyone's
attention and briefly describe the data integrity issues that we have in
various coprocessors. The majority of the issues stem from the fact that
we do not return a valid rowkey for certain queries. If a region moves in
the middle of a scan, the HBase client relies on the last returned rowkey
to adjust the scan boundaries when the scanner is reset to continue the
scan operation. As long as the region does not move, the scan is not
expected to return invalid data. However, if the region moves in the
middle of an ongoing scan, the scan can resume from the wrong boundary and
return incorrect data, causing data integrity issues.
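
To make the failure mode concrete, here is a toy model in plain Python. It
is not the HBase or Phoenix API: the row list, the batch size, the function
names and the empty placeholder rowkey are all assumptions chosen for
illustration, and the real coprocessor cases in PHOENIX-7106 may differ in
detail. It only shows why a scan that reports a rowkey other than the last
row it actually scanned is harmless until the scanner has to be reset.

```python
from bisect import bisect_right

# Hypothetical sorted rowkeys held by a single region.
ROWS = [b"r1", b"r2", b"r3", b"r4", b"r5", b"r6"]


def scan_batch(start_after, batch_size, return_valid_rowkey):
    """Server side: scan up to batch_size rows strictly after start_after.

    Returns (rowkey reported to the client, last row really scanned,
    number of rows counted in this batch).
    """
    begin = bisect_right(ROWS, start_after)
    batch = ROWS[begin:begin + batch_size]
    if not batch:
        return None, None, 0
    actual_last = batch[-1]
    # A well-behaved scanner reports the last row it scanned. The
    # assumed buggy one reports an empty placeholder rowkey instead.
    reported = actual_last if return_valid_rowkey else b""
    return reported, actual_last, len(batch)


def count_rows(return_valid_rowkey, region_moves_after_batch=None,
               batch_size=2):
    """Client side: sum per-batch counts, optionally with one region move."""
    total = 0
    server_pos = b""     # server-side scanner position
    last_returned = b""  # last rowkey the client has seen
    batch_no = 0
    while True:
        reported, actual_last, n = scan_batch(
            server_pos, batch_size, return_valid_rowkey)
        if n == 0:
            return total
        total += n
        batch_no += 1
        last_returned = reported
        server_pos = actual_last
        if batch_no == region_moves_after_batch:
            # Region moved: the server-side scanner state is lost, so the
            # client reopens the scan right after the last returned rowkey.
            server_pos = last_returned


if __name__ == "__main__":
    print(count_rows(True))       # valid rowkeys, no region move
    print(count_rows(True, 1))    # valid rowkeys, move after first batch
    print(count_rows(False))      # placeholder rowkeys, no region move
    print(count_rows(False, 1))   # placeholder rowkeys, move after first batch
```

In this model only the last case miscounts: after the simulated move the
client restarts from the placeholder key and the first batch is scanned a
second time. With other invalid rowkeys the same mechanism could just as
well skip rows instead of repeating them.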

Given the critical nature of these issues, I would like to propose that we
treat this as a high priority for the upcoming 5.2.0 release, and not
merge any other feature or big change to the master branch until this is
merged. The PR is not ready yet: additional changes are still only in my
local checkout and need to be rebased onto the current master.

I will get back to this discussion thread as soon as the PR and the doc are
updated with the latest findings. The changes touch many of our coproc
scanner implementations and hence will require significant review as
well.
It would be great if we can hold off on merging any feature or big change
to the master branch until this gets in, so as not to complicate
merging/rebasing. Once this is merged to the master branch, I would like
to cut the 5.2 branch from master so we can move forward with the 5.2.0
release.

Please let me know if this looks good or if you have any other high
priority work for 5.2.0.
