[ https://issues.apache.org/jira/browse/HBASE-10466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yunfan Zhong updated HBASE-10466:
---------------------------------
    Summary: Bugs that cause flushes being skipped during HRegion close could cause data loss  (was: Bugs that causes flushes being skipped during HRegion close could cause data loss)

> Bugs that cause flushes being skipped during HRegion close could cause data loss
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-10466
>                 URL: https://issues.apache.org/jira/browse/HBASE-10466
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.89-fb
>            Reporter: Yunfan Zhong
>            Priority: Critical
>             Fix For: 0.89-fb
>
>         Attachments: Fix-bugs-that-causes-flushes-being-skipped-during-re.patch
>
>
> During region close, there are two flushes to ensure that nothing is left
> in memory. When there is data only in the current memstore, 1 flush is
> required. When there is also data in the memstore's snapshot, 2 flushes are
> essential; otherwise we lose data. However, we recently found two bugs that
> lead to at least 1 flush being skipped and that caused data loss.
>
> Bug 1: Wrong calculation of HRegion.memstoreSize
> When a flush fails, the data to be flushed is kept in each MemStore's
> snapshot and waits for the next flush attempt to pick it up. But when the
> next flush succeeds, the counter of total memstore size in HRegion is always
> reduced by the sum of the current memstore sizes instead of the sizes of the
> snapshots left over from the previous failed flush. As a result, almost
> every time a flush fails, HRegion.memstoreSize gets reduced by a wrong
> value. If the region flush cannot proceed for a couple of cycles, the
> current memstore can grow much larger than the snapshot, so memstoreSize is
> likely to drift far below the real value. In the extreme case, if the
> accumulated error exceeds the HRegion's memstore size limit, every further
> flush is skipped, because a flush does nothing if memstoreSize is not larger
> than 0.
> When the region is closing, if the two flushes get skipped and leave data in
> the current memstore and/or the snapshot, we could lose data up to the
> memstore size limit of the region.
> The fix is to deduct from memstoreSize the correct size of the data that is
> actually going to be flushed.
>
> Bug 2: Conditions for the first flush of region close (so-called pre-flush)
> If memstoreSize is smaller than a certain value, or if a flush is already in
> progress when region close starts, the first flush is skipped and only the
> second flush takes place. However, two flushes are required in case the
> previous flush failed and left some data in the snapshot. The bug could
> cause loss of the data in the current memstore.
> The fix is to remove all conditions except the abort check, so that region
> close always performs 2 flushes.
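
The accounting drift described in Bug 1, and the skipped pre-flush in Bug 2, can be illustrated with a small self-contained toy model. This is only a sketch under stated assumptions, not the HBase code or the attached patch: the class ToyRegion, its fields, and its methods are hypothetical names invented for illustration, the region has a single store, sizes are plain byte counts, and a flush failure is simulated with a boolean flag.

```java
/** Toy model of one region with a single store. Hypothetical; not HBase code. */
class ToyRegion {
    long memstoreSize = 0; // region-level counter (stands in for HRegion.memstoreSize)
    long current = 0;      // bytes in the active memstore
    long snapshot = 0;     // bytes in the snapshot waiting to be flushed
    long persisted = 0;    // bytes successfully written out
    final boolean fixedAccounting;

    ToyRegion(boolean fixedAccounting) {
        this.fixedAccounting = fixedAccounting;
    }

    void put(long bytes) {
        current += bytes;
        memstoreSize += bytes;
    }

    /** One flush attempt. Returns false if it was skipped or if the write failed. */
    boolean flush(boolean failWrite) {
        if (memstoreSize <= 0) {
            return false; // the counter claims there is nothing to flush
        }
        long currentBefore = current;
        if (snapshot == 0) {
            // No leftover snapshot: move the active memstore into the snapshot.
            snapshot = current;
            current = 0;
        }
        if (failWrite) {
            return false; // the snapshot is kept for the next attempt
        }
        long flushed = snapshot;
        persisted += flushed;
        snapshot = 0;
        // Buggy variant: subtract the size of the current memstore.
        // Fixed variant: subtract the size of what was actually flushed.
        memstoreSize -= fixedAccounting ? flushed : currentBefore;
        return true;
    }

    /** Region close: an optional pre-flush, then the final flush. Returns bytes lost. */
    long close(boolean runPreFlush) {
        if (runPreFlush) {
            flush(false);
        }
        flush(false);
        return current + snapshot;
    }
}
```

In this model, the sequence put(10), a failed flush, put(500), two successful flushes, and put(200) leaves the buggy counter at -290 while 200 bytes are still in memory, so both flushes of close(true) are skipped and those 200 bytes are lost. With fixedAccounting set to true, the counter is 200 after the same sequence and close(true) loses nothing. Separately, put(10), a failed flush, put(500), and then close(false) loses the 500 bytes in the current memstore even with correct accounting, because the single remaining flush only writes the leftover snapshot.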