Correct, no records will ever be updated.

We do have a custom coprocessor loaded, but before I get to that you should have more context about what we're actually doing. We're running GeoMesa [1] on top of HBase. Practically, this means that for every record we write, we write twice: once to a spatio-temporal indexed table and once to an attribute indexed table. What we've seen is that the value in one of the tables, but not the other, is becoming corrupt. As far as we can tell, no custom read path code is involved, since we've validated the raw binary values using direct HBase API access, and the write path has been verified with a second bulk load. Additionally, since the same code writes the values for each index, I feel confident ruling out the write path. As I understand it, so far the only manipulation of the values is happening during compactions.
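In case it's useful, here's a minimal sketch (in Python, not our actual validation code) of the kind of check we did: XOR the bytes we read back through the HBase API against the known-good bytes from the bulk-ingest output, which pinpoints exactly which bit flipped.

```python
def find_flipped_bits(expected: bytes, actual: bytes):
    """Return (byte_index, bit_index) for every bit that differs
    between two equal-length byte strings."""
    assert len(expected) == len(actual), "values must be the same length"
    flips = []
    for i, (e, a) in enumerate(zip(expected, actual)):
        diff = e ^ a  # nonzero bits mark corrupted positions
        for bit in range(8):
            if diff & (1 << bit):
                flips.append((i, bit))
    return flips

# Example: a single-bit flip in the third byte
good = bytes.fromhex("00ff42")
bad  = bytes.fromhex("00ff43")  # 0x42 ^ 0x43 == 0x01 -> bit 0 of byte 2
print(find_flipped_bits(good, bad))  # -> [(2, 0)]
```

In our case the diff for the corrupted records always came back as exactly one (byte, bit) pair.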

The coprocessor we're using is available here [2]. It's not doing anything too crazy: just filtering and, depending on the query type, deserializing the row values and/or rolling up some aggregations.

Thanks,
Austin

[1] https://www.geomesa.org/
[2] https://github.com/locationtech/geomesa/blob/master/geomesa-hbase/geomesa-hbase-datastore/src/main/scala/org/locationtech/geomesa/hbase/coprocessor/GeoMesaCoprocessor.scala

On 2019-03-20 21:34, Sean Busbey wrote:
So you're saying no records should ever be updated, right?

Do you have any coprocessors loaded?

On Wed, Mar 20, 2019, 20:32 aheyne <ahe...@ccri.com> wrote:

I don't have the WALs, but due to the nature of the data each record/key
is unique. The keys for the data are generated using spatio-temporal
dimensions of the observation.

-Austin

On 2019-03-20 21:25, Sean Busbey wrote:
Have you examined the WALs for writes to the impacted cells to verify
an update wasn't written with the change to the value?

On Wed, Mar 20, 2019, 17:47 Austin Heyne <ahe...@ccri.com> wrote:

Hey all,

We're running HBase 1.4.8 on EMR 5.20 backed by S3, and we're seeing a
bit get flipped in some record values.

We've performed a bulk ingest and bulk load of a large chunk of data and
then pointed a live ingest feed at that table. After a period of time we
found that a few records in the table had been corrupted and were one
bit different from their original value. Since we saved the output of
the bulk ingest, we re-loaded those files and verified that at the time
of bulk load the record was correct. This seems to us to indicate that
at some point during the live ingest writes the record was corrupted.

I've verified that the region the record is in has never been split,
but it has received over 2 million write requests, so there very likely
could have been some minor compactions there.

Has anyone seen anything like this before?

Thanks,
Austin

--
Austin L. Heyne
