[ https://issues.apache.org/jira/browse/HBASE-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yunfan Zhong updated HBASE-10464:
---------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> Race condition during RS shutdown that could cause data loss
> ------------------------------------------------------------
>
>                 Key: HBASE-10464
>                 URL: https://issues.apache.org/jira/browse/HBASE-10464
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.89-fb
>            Reporter: Yunfan Zhong
>            Priority: Critical
>             Fix For: 0.89-fb
>
>         Attachments: D1120497.diff
>
>
> Bug scenario (T* are timestamps, say T1 < T2 < ... < Tn):
> 1. Master assigns a region to the RS at T1.
> 2. The RS works on opening the region from T1 to T3.
> 3. While the region is still opening, the RS starts to shut down at T2; the
> DFS client is closed at T5.
> 4. Regions owned by the RS are closed as a step of RS shutdown, except that
> the newly opened region stays online from T3 to T5 and holds some mutations
> in memory after a possible last flush at T4.
> 5. Since the master thinks the RS had a clean shutdown, there is no log
> splitting. The HLog is moved to the old logs directory as usual.
> 6. Mutations held in memory between T4 and T5 (or between T3 and T5 if there
> was no flush at T4) are never flushed. They exist only in the WAL, if it is
> turned on.
> The fix is to prevent region opening from succeeding when the RS is shutting
> down.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
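
For illustration only, the idea behind the fix (refuse to bring a region online once shutdown has begun) might be sketched as below. This is not the content of D1120497.diff and does not use HBase's real classes; RegionOpenGuard, finishOpen and beginShutdown are hypothetical names. The assumption is that the "is the server stopping?" check and the "mark region online" step share one lock with the shutdown path, so a region either appears in the set that shutdown flushes and closes, or its open fails and the master can reassign it.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: not HBase code, names are illustrative only. */
public class RegionOpenGuard {
    private final Object lock = new Object();
    private boolean stopping = false;
    private final Set<String> onlineRegions = new LinkedHashSet<>();

    /**
     * Last step of opening a region. Returns false, leaving the region
     * offline, if shutdown has already started.
     */
    public boolean finishOpen(String regionName) {
        synchronized (lock) {
            if (stopping) {
                return false; // open must not succeed during shutdown
            }
            onlineRegions.add(regionName);
            return true;
        }
    }

    /**
     * Marks the server as stopping and returns the regions that were
     * online at that moment, i.e. the ones shutdown must flush and close.
     */
    public List<String> beginShutdown() {
        synchronized (lock) {
            stopping = true;
            return new ArrayList<>(onlineRegions);
        }
    }
}
```

With this ordering, a region whose open completes before beginShutdown() is included in the returned list and gets flushed; one whose open completes afterwards is rejected, so the T3 to T5 window in the scenario cannot occur.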