Hello,

In a couple of situations we were noticing some odd problems with old data
appearing in the application, and I finally found a reproducible scenario.
Here's what we're seeing in one basic case:

1. Using a scan in the hbase shell, one of our column cells (both the column
name and value are simple longs) looks like so:

column=thing:\x00\x00\x00\x00\x00\x00\x00\x02, timestamp=1332795701976,
value=\x00\x00\x00\x00\x00\x00\x00s

2. If we then use a "Put" to update that cell to a new value, it looks as
we'd expect, like so:

column=thing:\x00\x00\x00\x00\x00\x00\x00\x02, timestamp=1332866682295,
value=\x00\x00\x00\x00\x00\x00\x00u

3. If we then use a "Delete" to remove that column, the column does not
disappear from the scan; instead we see the original cell again:

column=thing:\x00\x00\x00\x00\x00\x00\x00\x02, timestamp=1332795701976,
value=\x00\x00\x00\x00\x00\x00\x00s

So, for some reason, at least in this case, the tombstone/delete marker
doesn't appear to be preventing new scans from seeing the old data.
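
To make the three steps concrete, here is a small self-contained Java model
of what the scan output looks like it is doing. This is only a sketch: it is
not HBase code and not our client code, the class and method names are made
up for illustration, and it assumes (as an unconfirmed hypothesis) that the
delete marker masks only the newest version of the cell while the older
version is still stored.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of one versioned cell. NOT the HBase API; all names are hypothetical. */
public class VersionedCellSketch {
    // timestamp -> value, ordered newest timestamp first
    private final TreeMap<Long, Long> versions =
            new TreeMap<>(Comparator.<Long>reverseOrder());
    // timestamps hidden by a delete marker
    private final Set<Long> masked = new HashSet<>();

    /** Stores a new version of the cell at the given timestamp. */
    public void put(long timestamp, long value) {
        versions.put(timestamp, value);
    }

    /** Assumption: the marker hides only the newest visible version. */
    public void deleteNewestVersionOnly() {
        for (Long ts : versions.keySet()) {
            if (!masked.contains(ts)) {
                masked.add(ts);
                return;
            }
        }
    }

    /** Returns the newest version that is not masked, or null if none is visible. */
    public Long read() {
        for (Map.Entry<Long, Long> e : versions.entrySet()) {
            if (!masked.contains(e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        VersionedCellSketch cell = new VersionedCellSketch();
        cell.put(1332795701976L, 0x73L); // step 1: value ends in 's' (0x73)
        cell.put(1332866682295L, 0x75L); // step 2: value ends in 'u' (0x75)
        System.out.println(cell.read()); // newest version is visible
        cell.deleteNewestVersionOnly();  // step 3 under the stated assumption
        System.out.println(cell.read()); // the older version becomes visible again
    }
}
```

If something like this is what is happening, the result would depend on
whether the delete targets only the newest version of the column or all of
its versions.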

Note that this is a small development cluster of HBase (version:
hbase-0.90.4-cdh3u2) which contains one master and three region servers,
and I have confirmed that the clocks are synchronized properly between the
four machines.  Also note that we're using the Java client API to run the
Put/Delete commands noted above.

Any ideas on how old data could still appear in a Get/Scan like this, and
are there any workarounds we could try?  I saw HBASE-4536, but after
reading that thread it didn't seem pertinent to this more basic scenario.

Thanks in advance for any pointers!

      -Shawn
