[ https://issues.apache.org/jira/browse/HBASE-16649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15524530#comment-15524530 ]
Hudson commented on HBASE-16649:
--------------------------------

FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1679 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1679/])
HBASE-16649 Truncate table with splits preserved can cause both data (matteo.bertozzi: rev f06c0060aa13a2b5b18edeb66b7479bdd3c6fdc8)
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestTruncateTableProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteTableProcedure.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java

> Truncate table with splits preserved can cause both data loss and truncated data appeared again
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-16649
>                 URL: https://issues.apache.org/jira/browse/HBASE-16649
>             Project: HBase
>          Issue Type: Bug
>   Affects Versions: 1.1.3
>            Reporter: Allan Yang
>            Assignee: Matteo Bertozzi
>             Fix For: 2.0.0, 1.3.0, 1.1.7, 0.98.23, 1.2.4
>
>         Attachments: HBASE-16649-v0.patch, HBASE-16649-v1.patch, HBASE-16649-v2.patch
>
>
> Truncating a table with splits preserved deletes the hfiles but reuses the
> previous regioninfo, which can cause odd behavior.
> - Case 1: *Data appeared after truncate*
> reproduce procedure:
> 1. create a table, let's say 'test'
> 2. write data to 'test', make sure the memstore of 'test' is not empty
> 3. truncate 'test' with splits preserved
> 4. kill the regionserver hosting the region(s) of 'test'
> 5. start the regionserver. Now it is time to witness the miracle: the
> truncated data reappears in table 'test'
> - Case 2: *Data loss*
> reproduce procedure:
> 1. create a table, let's say 'test'
> 2. write some data to 'test', no matter how much
> 3. truncate 'test' with splits preserved
> 4. restart the regionserver to reset the seqid
> 5. write some data, but less than in step 2, since we don't want the seqid
> to run past the one reached in step 2
> 6. kill the regionserver hosting the region(s) of 'test'
> 7. restart the regionserver. Congratulations! The data written in step 5 is
> now all lost
> *Why?*
> for case 1
> Preserving splits in the truncate table procedure does not change the
> regioninfo, so when log replay happens, the 'unflushed' data written before
> the truncate is restored into the region.
> for case 2
> The flushedSequenceIdByRegion values are stored in the Master in a map keyed
> by the region's encodedName. Although the table is truncated, the region's
> name does not change, since we chose to preserve the splits. So after the
> truncate, the region's sequenceid is reset in the regionserver but not in
> the master. When a flush is reported to the master, the master rejects the
> sequenceid update because the new value is smaller than the old one. The
> same happens in log replay: all the edits written in step 5 are skipped
> since they have a smaller seqid.
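The two failure modes described under "Why?" can be reproduced with a small standalone model. This is only a sketch under stated assumptions: the class TruncateSeqIdSketch and its methods reportFlush, replay and forgetRegion are hypothetical names invented for illustration and are not HBase APIs. The only thing taken from the report is the idea of a master-side map keyed by the region's encoded name. forgetRegion shows one conceivable remedy (dropping the stale entry at truncate time); it is not a description of the committed patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the master-side bookkeeping; NOT actual HBase code.
final class TruncateSeqIdSketch {
    // Stand-in for the master's map of last flushed seqid per encoded region name.
    private final Map<String, Long> flushedSeqIdByRegion = new HashMap<>();

    /** Records a flush report. Returns false when the report is rejected as stale. */
    boolean reportFlush(String encodedRegionName, long flushedSeqId) {
        Long known = flushedSeqIdByRegion.get(encodedRegionName);
        if (known != null && flushedSeqId < known) {
            return false; // the larger, pre-truncate value is kept
        }
        flushedSeqIdByRegion.put(encodedRegionName, flushedSeqId);
        return true;
    }

    /** Log replay: keeps only edits whose seqid is above the last flushed seqid. */
    List<Long> replay(String encodedRegionName, List<Long> walEditSeqIds) {
        long lastFlushed = flushedSeqIdByRegion.getOrDefault(encodedRegionName, -1L);
        List<Long> replayed = new ArrayList<>();
        for (long seqId : walEditSeqIds) {
            if (seqId > lastFlushed) {
                replayed.add(seqId);
            }
        }
        return replayed;
    }

    /** Assumed remedy for illustration: forget the region when its table is truncated. */
    void forgetRegion(String encodedRegionName) {
        flushedSeqIdByRegion.remove(encodedRegionName);
    }

    public static void main(String[] args) {
        TruncateSeqIdSketch master = new TruncateSeqIdSketch();
        String region = "abc123"; // hypothetical encoded region name
        master.reportFlush(region, 100L); // state before the truncate

        // Case 1: edits 101 and 102 were unflushed before the truncate. The region
        // name is unchanged, so replay after a crash restores them.
        System.out.println("case 1 replayed: " + master.replay(region, Arrays.asList(101L, 102L)));

        // Case 2: the seqid restarts on the regionserver, but the master still holds 100.
        System.out.println("case 2 flush accepted: " + master.reportFlush(region, 5L));
        System.out.println("case 2 replayed: " + master.replay(region, Arrays.asList(3L, 4L, 5L)));
    }
}
```

In this model, case 1 replays both pre-truncate edits, while case 2 rejects the flush report and replays nothing. Calling forgetRegion before the post-truncate writes lets the smaller seqid be accepted again.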