[ 
https://issues.apache.org/jira/browse/HBASE-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunfan Zhong updated HBASE-10464:
---------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> Race condition during RS shutdown that could cause data loss
> ------------------------------------------------------------
>
>                 Key: HBASE-10464
>                 URL: https://issues.apache.org/jira/browse/HBASE-10464
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.89-fb
>            Reporter: Yunfan Zhong
>            Priority: Critical
>             Fix For: 0.89-fb
>
>         Attachments: D1120497.diff
>
>
> Bug scenario (T* are timestamps, say T1 < T2 < ... < Tn):
> 1. Master assigns a region to RS at T1
> 2. The RS works on opening the region from T1 to T3.
> 3. While the region is being opened, the RS starts to shut down at T2, and 
> the DFS client is closed at T5.
> 4. Regions owned by the RS are closed as a step of RS shutdown, except that 
> the newly opened region is online from T3 to T5 and holds some mutations in 
> memory after a possible last flush at T4.
> 5. Since the master thinks the RS had a clean shutdown, there is no log 
> splitting. The HLog is moved to the old logs directory as usual.
> 6. Mutations in memory from T4 to T5 (if T4 does not exist, from T3 to T5) 
> are not flushed. They exist only in the WAL, if it is enabled, and because 
> the log is never split and replayed, they are lost.
> The fix is to prevent region opening from succeeding when the RS is shutting down.
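
The fix described above could be sketched roughly as follows. This is only an illustration of the idea, not the attached patch: the class RegionOpenGuard and its methods tryBringOnline and beginShutdown are hypothetical names, not HBase APIs. The assumption is that the final "bring online" step of a region open and the start of shutdown are serialized on one lock, so a region either appears in the set that shutdown closes or its open fails.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; names are illustrative and not taken from HBase.
class RegionOpenGuard {
    private final Object lock = new Object();
    private boolean stopping = false;
    private final Set<String> onlineRegions = new HashSet<>();

    // Last step of opening a region. Returns false if shutdown has already
    // begun, in which case the caller should abort the open.
    boolean tryBringOnline(String regionName) {
        synchronized (lock) {
            if (stopping) {
                return false;
            }
            onlineRegions.add(regionName);
            return true;
        }
    }

    // First step of shutdown. Sets the stopping flag and returns a snapshot
    // of the regions that shutdown must flush and close.
    Set<String> beginShutdown() {
        synchronized (lock) {
            stopping = true;
            return new HashSet<>(onlineRegions);
        }
    }
}
```

With this arrangement, a region whose open completes after shutdown has started is never reported as online, so the master would not treat it as cleanly closed by this RS.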



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
