I don't have the WALs, but due to the nature of the data each record/key is unique, so we wouldn't expect an update to have been written to an existing cell. The keys are generated from the spatio-temporal dimensions of the observation.

-Austin

On 2019-03-20 21:25, Sean Busbey wrote:
Have you examined the WALs for writes to the impacted cells to verify that
an update changing the value wasn't written?

On Wed, Mar 20, 2019, 17:47 Austin Heyne <ahe...@ccri.com> wrote:

Hey all,

We're running HBase 1.4.8 on EMR 5.20 backed by S3 and we're seeing a
bit get flipped in some record values.

We've performed a bulk ingest and bulk load of a large chunk of data and
then pointed a live ingest feed at that table. After a period of time we
found that a few records in the table had been corrupted and were one
bit different from their original value. Since we saved the output of
the bulk ingest we re-loaded those files and verified that at the time
of bulk load the record was correct. This seems to us to indicate that
at some point during the live ingest writes the record was corrupted.
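
A minimal sketch of how a one-bit difference like this could be pinpointed, assuming the original value (from the saved bulk ingest output) and the value currently in the table are both available as raw bytes. The function name and the sample bytes are hypothetical and are not taken from the actual data:

```python
from typing import List, Tuple


def differing_bits(original: bytes, observed: bytes) -> List[Tuple[int, int]]:
    """Return (byte_offset, bit_index) for every bit that differs.

    bit_index 0 is the least significant bit of the byte.
    """
    if len(original) != len(observed):
        raise ValueError("values must be the same length to compare bit by bit")

    diffs = []
    for offset, (a, b) in enumerate(zip(original, observed)):
        xor = a ^ b  # set bits mark the positions where the two bytes disagree
        for bit in range(8):
            if xor & (1 << bit):
                diffs.append((offset, bit))
    return diffs


# Hypothetical example values: byte 1 differs in bit 2 (0x20 vs 0x24).
print(differing_bits(b"\x10\x20\x30", b"\x10\x24\x30"))
```

A result with exactly one entry means the two values differ by a single flipped bit. Several entries would point to a different kind of change, such as an overwritten value.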

I've verified that the region containing the record has never been split,
but it has received over 2 million write requests, so it has very likely
gone through some minor compactions.

Has anyone seen anything like this before?

Thanks,
Austin

--
Austin L. Heyne

