First of all, thanks for the reply! I appreciate you taking the time to 
address our issues.

> It seems the mentioned "hiccup" caused RS(es) crash(es), as you got RITs 
> and recovered edits under these region dirs.

To give more context, I was making changes to increase the snapshot timeout 
on region servers and did a graceful restart, so I didn't mean to crash 
anything, but it seems I restarted too many region servers at once (about 
half the cluster), which resulted in a number of regions getting stuck in 
transition. This was attempted on a live production cluster in the hope of 
avoiding downtime, but it resulted in an outage to our application instead. 
Unfortunately, the master and region server logs have since rolled and aged 
out, so I don't have them anymore.

> The fact there was a "recovered" dir under some region dirs means that 
> when the snapshot was taken, the crashed RS(es)' WAL(s) had been split, 
> but not completely replayed yet.

The snapshot was taken many days later. File timestamps under the 
recovered.edits directory were from June 6th and the snapshot from the 
pastebin was taken on June 14th, but snapshots were actually taken many 
times with the same result (the ETL jobs are launched at least daily in 
Oozie). Do you mean that if a snapshot was taken before the region was 
fully recovered, it could result in this state even if the snapshot was 
subsequently deleted?
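
In case it is useful for anyone reproducing the timeline, here is roughly 
how the timestamps can be checked with the Hadoop FileSystem API. This is 
only a sketch: it assumes the default layout where region dirs live under 
/hbase/data/<namespace>/<table>, and the table path is a placeholder.

    import java.util.Date;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ListRecoveredEdits {
      public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Placeholder path; substitute the real namespace and table name.
        Path tableDir = new Path("/hbase/data/default/my_table");
        for (FileStatus region : fs.listStatus(tableDir)) {
          Path edits = new Path(region.getPath(), "recovered.edits");
          if (!fs.exists(edits)) {
            continue;
          }
          for (FileStatus f : fs.listStatus(edits)) {
            // The modification time shows when the edits were split out
            // of the WALs.
            System.out.println(f.getPath() + "  "
                + new Date(f.getModificationTime()));
          }
        }
      }
    }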

> Would you know which specific hbase version this is?

It is EMR 5.22, which runs HBase 1.4.9 (with some Amazon-specific edits, 
maybe? I noticed the line numbers for HRegion.java in the stack trace don't 
quite line up with those in the 1.4.9 tag on GitHub).

> Could your job restore the snapshot into a temp table and then read from this 
> temp table using TableInputFormat, instead?

Maybe we could do this, but it would take us some effort to make the 
changes, test, release, etc. Of course we'd rather not jump through hoops 
like this.
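
For the archive, here is roughly what I think that change would look like. 
This is only a sketch under my assumptions: the snapshot and table names 
are made up, "restoring" into a new table is done via clone_snapshot, and 
error handling and cleanup are omitted.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SnapshotToTempTable {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Materialize the snapshot as a throwaway table (made-up names).
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
          admin.cloneSnapshot("etl_snapshot",
              TableName.valueOf("etl_tmp_table"));
        }
        // Point TableInputFormat at the temp table, not the snapshot.
        conf.set(TableInputFormat.INPUT_TABLE, "etl_tmp_table");
        JavaSparkContext sc =
            new JavaSparkContext(new SparkConf().setAppName("etl"));
        JavaPairRDD<ImmutableBytesWritable, Result> rdd =
            sc.newAPIHadoopRDD(conf, TableInputFormat.class,
                ImmutableBytesWritable.class, Result.class);
        // ... run the ETL over rdd, then disable/drop etl_tmp_table.
      }
    }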

> In this case, it's finding a "recovered" folder under the region dirs, so 
> it will replay the edits there. Looks like a problem with 
> TableSnapshotInputFormat; it seems weird that it tries to delete edits in 
> a non-staging dir (your path suggests it's trying to delete the actual 
> edits folder), which could cause data loss if it succeeded in deleting 
> edits before the RSes actually replayed them.

I agree that this "seems weird" to me as well, though I'm not intimately 
familiar with all of the inner workings of the hbase code. The potential 
data loss is what I'm wondering about: would data loss have occurred if we 
had happened to execute our job under a user that had delete permissions on 
those HDFS directories? Or did the edits actually get replayed while the 
regions were stuck in transition, and the files just didn't get cleaned up? 
Is this something for which I should file a defect in JIRA?
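
For what it's worth, my understanding of how TableSnapshotInputFormat is 
supposed to be wired up is roughly the following. This is a sketch, not our 
actual job config; the snapshot name and restore path are made up.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat;
    import org.apache.hadoop.mapreduce.Job;

    public class SnapshotInputSetup {
      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(HBaseConfiguration.create());
        // restoreDir is supposed to be a scratch location the job user
        // can write to; I would have expected any replay or cleanup to
        // stay under it rather than touch the live region dirs.
        TableSnapshotInputFormat.setInput(job, "etl_snapshot",
            new Path("/tmp/snapshot_restore"));
      }
    }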

Thanks again,

--Jacob LeBlanc


-----Original Message-----
From: Wellington Chevreuil [mailto:wellington.chevre...@gmail.com] 
Sent: Monday, June 17, 2019 3:55 PM
To: user@hbase.apache.org
Subject: Re: TableSnapshotInputFormat failing to delete files under 
recovered.edits

It seems the mentioned "hiccup" caused RS(es) crash(es), as you got RITs 
and recovered edits under these region dirs. The fact there was a 
"recovered" dir under some region dirs means that when the snapshot was 
taken, the crashed RS(es)' WAL(s) had been split, but not completely 
replayed yet.

Since you are facing errors when reading from the table snapshot, and the 
stack trace shows TableSnapshotInputFormat is using the 
"HRegion.openHRegion" code path to read the snapshotted data, it will 
basically do the same as an RS would when trying to assign a region. In 
this case, it's finding a "recovered" folder under the region dirs, so it 
will replay the edits there. Looks like a problem with 
TableSnapshotInputFormat; it seems weird that it tries to delete edits in 
a non-staging dir (your path suggests it's trying to delete the actual 
edits folder), which could cause data loss if it succeeded in deleting 
edits before the RSes actually replayed them. Would you know which specific 
hbase version this is? Could your job restore the snapshot into a temp 
table and then read from this temp table using TableInputFormat, instead?

On Mon, Jun 17, 2019 at 5:22 PM Jacob LeBlanc 
<jacob.lebl...@microfocus.com> wrote:

> Hi,
>
> We periodically execute Spark jobs to run ETL from some of our HBase 
> tables to another data repository. The Spark jobs read data by taking 
> a snapshot and then using the TableSnapshotInputFormat class. Lately 
> we've been having some failures: when the jobs try to read the data, 
> they attempt to delete files under the recovered.edits directory for 
> some regions, and the user under which we run the jobs doesn't have 
> permission to do that. A pastebin of the error and stack trace from one 
> of our job logs is
> here: https://pastebin.com/MAhVc9JB
>
> This started happening after we upgraded to EMR 5.22, where the 
> recovered.edits directory is collocated with the WALs in HDFS; it used 
> to be in S3-backed EMRFS.
>
> I have two questions regarding this:
>
>
> 1) First off, why are these files under the recovered.edits directory? 
> The timestamps of the files coincide with a hiccup we had with our 
> cluster, where I had to use "hbase hbck -fixAssignments" to fix regions 
> that were stuck in transition. But that command seemed to work just 
> fine, all regions were assigned, and there have since been no 
> inconsistencies. Does this mean the WALs were not replayed correctly? 
> Does "hbase hbck -fixAssignments" not recover regions properly?
>
> 2) Why is our job trying to delete these files? I don't know enough 
> to say for sure, but it seems like using TableSnapshotInputFormat to 
> read snapshot data should not be trying to recover or delete edits.
>
> I've fixed the problems by running "assign '<region>'" in hbase shell 
> for every region that had files under the recovered.edits directory, 
> and those files seemed to be cleaned up when the assignment completed. 
> But I'd like to understand this better, especially if something is 
> interfering with replaying edits from the WALs (also, making sure our 
> ETL jobs don't start failing again would be nice).
>
> Thanks!
>
> --Jacob LeBlanc
>
>
