[ 
https://issues.apache.org/jira/browse/HBASE-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingcheng Du updated HBASE-16841:
---------------------------------
    Description: 
The following steps are likely to lose MOB data when working with 
snapshots.
1. Create a MOB-enabled table by running {{create 't1', {NAME => 'f1', IS_MOB 
=> true, MOB_THRESHOLD => 0}}}.
2. Put millions of rows into the table.
3. Run {{snapshot 't1','t1_snapshot'}} to take a snapshot of table t1.
4. Run {{clone_snapshot 't1_snapshot','t1_cloned'}} to clone the snapshot.
5. Run {{delete_snapshot 't1_snapshot'}} to delete the snapshot.
6. Run {{disable 't1'}} and {{delete 't1'}} to delete the table.
7. Go to the archive directory of t1. The number of .link directories 
differs from the number of hfiles, which means some data will be lost after 
the hfile cleaner runs.
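
The check in step 7 can be sketched as a small Java program. This is only an illustration under stated assumptions: it works on a local copy of one archived column-family directory rather than on HDFS, and it assumes each back-reference directory is named {{.links-<hfile name>}}. The class and method names ({{ArchiveLinkCheck}}, {{filesWithoutLinkDir}}) are hypothetical and not part of HBase.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ArchiveLinkCheck {

    // Assumption: back-reference directories are named ".links-<hfile name>".
    static final String LINK_DIR_PREFIX = ".links-";

    /** Returns the names of regular files in familyDir that have no matching link directory. */
    static Set<String> filesWithoutLinkDir(Path familyDir) throws IOException {
        List<Path> entries;
        try (Stream<Path> stream = Files.list(familyDir)) {
            entries = stream.collect(Collectors.toList());
        }
        Set<String> files = new TreeSet<>();
        Set<String> linked = new TreeSet<>();
        for (Path entry : entries) {
            String name = entry.getFileName().toString();
            if (Files.isDirectory(entry) && name.startsWith(LINK_DIR_PREFIX)) {
                // Remember which file this link directory refers to.
                linked.add(name.substring(LINK_DIR_PREFIX.length()));
            } else if (Files.isRegularFile(entry)) {
                files.add(name);
            }
        }
        files.removeAll(linked);
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway directory with two files, only one of which has a link directory.
        Path dir = Files.createTempDirectory("archive-check");
        Files.createFile(dir.resolve("hfile1"));
        Files.createFile(dir.resolve("hfile2"));
        Files.createDirectory(dir.resolve(LINK_DIR_PREFIX + "hfile1"));
        System.out.println("Files without a link directory: " + filesWithoutLinkDir(dir));
    }
}
```

Any file the sketch reports has no link directory, so it corresponds to the mismatch described in step 7.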

This happens because, when a snapshot is taken on an enabled MOB table, each 
region flushes itself and takes a snapshot, and the MOB snapshot is taken only 
when the current region is the first region of the table. At that point, some 
regions might not have finished flushing, so some MOB files are not on disk yet. 
As a result, some MOB files are not recorded in the snapshot manifest.
To solve this, we need to take the MOB snapshot last, after the snapshots of 
all the online and offline regions are finished in 
{{EnabledTableSnapshotHandler}}.
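
The ordering problem can be illustrated with a toy model. This is a sketch, not HBase code: the names ({{MobSnapshotOrdering}}, {{Region}}, {{snapshotMobWithFirstRegion}}, {{snapshotMobLast}}) are hypothetical, and the regions flush one after another here, whereas real regions flush concurrently. It only shows why recording the MOB files together with the first region can miss files that later flushes produce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class MobSnapshotOrdering {

    /** Toy region: holds MOB file names that reach the shared MOB directory only on flush. */
    static final class Region {
        private final List<String> unflushedMobFiles;

        Region(List<String> unflushedMobFiles) {
            this.unflushedMobFiles = new ArrayList<>(unflushedMobFiles);
        }

        void flush(Set<String> mobDir) {
            mobDir.addAll(unflushedMobFiles);
            unflushedMobFiles.clear();
        }
    }

    /** Problematic order: the MOB directory is recorded while handling the first region. */
    static Set<String> snapshotMobWithFirstRegion(List<Region> regions, Set<String> mobDir) {
        Set<String> manifest = new LinkedHashSet<>();
        for (int i = 0; i < regions.size(); i++) {
            regions.get(i).flush(mobDir);
            if (i == 0) {
                manifest.addAll(mobDir); // the other regions have not flushed yet
            }
        }
        return manifest;
    }

    /** Proposed order: record the MOB directory only after every region is done. */
    static Set<String> snapshotMobLast(List<Region> regions, Set<String> mobDir) {
        for (Region region : regions) {
            region.flush(mobDir);
        }
        return new LinkedHashSet<>(mobDir);
    }

    /** Three regions, each with one pending MOB file. */
    static List<Region> sampleRegions() {
        return Arrays.asList(
            new Region(Collections.singletonList("mob-r1")),
            new Region(Collections.singletonList("mob-r2")),
            new Region(Collections.singletonList("mob-r3")));
    }

    public static void main(String[] args) {
        System.out.println("MOB snapshot with first region: "
            + snapshotMobWithFirstRegion(sampleRegions(), new LinkedHashSet<>()));
        System.out.println("MOB snapshot taken last: "
            + snapshotMobLast(sampleRegions(), new LinkedHashSet<>()));
    }
}
```

In this model the first variant records only the first region's MOB file, while the second records all three.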

  was:
When a snapshot is taken on an enabled MOB table, each region flushes itself 
and takes a snapshot, and the MOB snapshot is taken only when the current 
region is the first region of the table. At that point, some regions might not 
have finished flushing, so some MOB files are not on disk yet. As a result, 
some MOB files are not recorded in the snapshot manifest.
To solve this, we need to take the MOB snapshot last, after the snapshots of 
all the online and offline regions are finished in 
{{EnabledTableSnapshotHandler}}.


> Data loss in MOB files after cloning a snapshot and deleting that snapshot
> --------------------------------------------------------------------------
>
>                 Key: HBASE-16841
>                 URL: https://issues.apache.org/jira/browse/HBASE-16841
>             Project: HBase
>          Issue Type: Bug
>          Components: mob, snapshots
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>         Attachments: HBASE-16841.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
