I've been struggling with errors in the region moving tests on my HBase 3.0
WIP branch and have finally tracked the problems down to Phoenix's dummy
Cells (as well as some built-in assumptions in Phoenix which are not true
for HBase 3, see PHOENIX-7728
<https://issues.apache.org/jira/browse/PHOENIX-7728>).

HBase is not aware that these are dummy cells, and is considering the rows
as already processed when retrying scans after the region goes away from
under the scan, i.e. it restarts the scan from AFTER the dummy cell's
rowkey, leading to the scan skipping rows.
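
To make the failure mode concrete, here is a toy model of the retry logic. It is not HBase or Phoenix code: the class, the method and the Result type are made up for illustration, and it assumes the dummy result carries the rowkey of a row that has not been delivered to the client yet.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: every name here is hypothetical, not an HBase or Phoenix API.
public class DummyCellRetryDemo {

    // A scan result reduced to its rowkey plus a "this is a dummy" flag.
    private static final class Result {
        final String row;
        final boolean dummy;

        Result(String row, boolean dummy) {
            this.row = row;
            this.dummy = dummy;
        }
    }

    /**
     * Simulates one scan that is interrupted by a region move.
     * The server returns real rows before dummyIndex, then a dummy result
     * carrying the rowkey at dummyIndex, then the region goes away.
     * If ignoreDummies is false, the client treats the dummy like any other
     * result when it records where to resume.
     */
    public static List<String> scan(List<String> rows, int dummyIndex, boolean ignoreDummies) {
        List<String> delivered = new ArrayList<>();
        String resumeAfter = null; // last rowkey the client considers processed

        // First attempt, up to and including the dummy result.
        for (int i = 0; i <= dummyIndex; i++) {
            Result r = new Result(rows.get(i), i == dummyIndex);
            if (!r.dummy) {
                delivered.add(r.row); // only real rows reach the application
            }
            if (!r.dummy || !ignoreDummies) {
                resumeAfter = r.row; // the dummy rowkey is recorded too unless ignored
            }
        }

        // Retry against the new region location, strictly after the resume point.
        int start = (resumeAfter == null) ? 0 : rows.indexOf(resumeAfter) + 1;
        for (int i = start; i < rows.size(); i++) {
            delivered.add(rows.get(i));
        }
        return delivered;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a", "b", "c", "d");
        System.out.println(scan(rows, 2, false)); // row "c" is lost
        System.out.println(scan(rows, 2, true));  // all rows are returned
    }
}
```

In this model the only difference between the two runs is whether the dummy's rowkey is allowed to advance the resume point, which is the behaviour my HBase hack changes.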

I have been able to fix the tests by hacking HBase to ignore these dummy
cells (and fixing the Phoenix-side problems described in PHOENIX-7728
<https://issues.apache.org/jira/browse/PHOENIX-7728>), but I don't think
that hacking HBase to work with dummy cells is the way to go (nor am I sure
that it would be accepted by HBase).

AFAIU the dummy cells were added back in the HBase 1.x days, when there was
no other way to ensure timely responses from the server.

HBase 2 has introduced the keepalive/cursor mechanics, which IIUC serve the
exact same purpose as the Phoenix dummy cells.

I propose dropping the dummy cell mechanics from Phoenix, and using the
HBase keepalive/cursor mechanics instead (we may not even need the cursors).

If we cannot find a better way to shortcut some processing in Phoenix, we
may need to keep dummy cells internally, but we have to make sure that they
never appear on the wire and reach the client (i.e. in that case we'd need
to detect them and convert them to a heartbeat scan result somehow).

We will also need to consider backwards compatibility.

Is HBase 2/3 wire compatible enough that connecting HBase 2.x clients to
HBase 3 is even a possibility?

Do we want to support that?

When using HBase 2.x, if Phoenix starts to use the HBase keepalive
mechanics, will old clients work with that without changes, or do we need
to keep sending dummy cells for older clients?

Looking forward to hearing your take,

Istvan
