[ https://issues.apache.org/jira/browse/ACCUMULO-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner resolved ACCUMULO-444.
-----------------------------------

    Resolution: Fixed

> Data loss possible when tablet killed immediately after recovery
> ----------------------------------------------------------------
>
>                 Key: ACCUMULO-444
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-444
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.3.5
>         Environment: Running random walk, continuous ingest, and agitator on a 10-node cluster.
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>            Priority: Blocker
>              Labels: 14_qa_bug
>             Fix For: 1.4.0, 1.3.6
>
>
> Came in after a weekend of running tests to find that the Shard random walk
> test had lost data in its index table. After debugging, I found that the
> following sequence of events had occurred.
>
>  * Mutation X was written to the shard index on tablet T1
>  * X was minor compacted to file F1
>  * The tablet server serving T1 was killed
>  * When T1 came up on another tablet server, it did not know about F1
>
> The above sequence of events indicates that the !METADATA table lost data.
> So I started looking into that, and found the following sequence of events.
>
>  * Tablet server T1 serving METADATA tablet MT was killed
>  * MT comes up on another tablet server T2
>  * Mutation Y is written to MT about file F1 for tablet T1
>  * Tablet server T2 is killed
>  * MT comes up on tablet server T3
>  * The mutations for MT from T1 are recovered, but not those from T2,
>    therefore Y is lost
>
> There is code that is supposed to handle this situation, but it's not
> working. I think this issue exists in 1.3.
>
> Data loss is not certain in this situation. In the scenario above, when MT
> is loaded on T2 a minor compaction is started. If the server is killed
> before this minor compaction completes, then data loss will likely occur.
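
The second sequence of events can be illustrated with a toy model. This is only a sketch under stated assumptions, not Accumulo's recovery code: the `Cluster` class, the `logs_to_replay` set, the `record_log` flag, and the placeholder mutation `M0` are all hypothetical, and the tablet servers are called `ts1`..`ts3` to avoid clashing with tablet T1. The assumption is that recovery only replays write-ahead logs that were recorded for the tablet, so a mutation written to an unrecorded log disappears when its server dies.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model: one write-ahead log per tablet server, plus the set of
    logs that recovery believes it must replay for tablet MT (hypothetical)."""
    wal: dict = field(default_factory=dict)            # server -> [mutations]
    logs_to_replay: set = field(default_factory=set)   # servers whose logs get replayed

    def write(self, server, mutation, record_log=True):
        # Append to the server's log; optionally record that log for recovery.
        self.wal.setdefault(server, []).append(mutation)
        if record_log:
            self.logs_to_replay.add(server)

    def recover(self):
        # Replay every recorded log; mutations in unrecorded logs are lost.
        recovered = []
        for server in sorted(self.logs_to_replay):
            recovered.extend(self.wal.get(server, []))
        return recovered


def run(record_log_after_recovery):
    c = Cluster()
    c.write("ts1", "M0")    # MT is hosted on ts1, then ts1 is killed
    c.recover()             # MT comes up on ts2 and replays ts1's log
    # Y arrives while the post-recovery minor compaction is still running.
    c.write("ts2", "Y", record_log=record_log_after_recovery)
    return c.recover()      # ts2 is killed, MT comes up on ts3


print(run(False))  # ts2's log was never recorded, so Y is missing
print(run(True))   # ts2's log is replayed as well, so Y survives
```

In this model the fix amounts to making sure the log of every server that hosted the tablet since its last completed minor compaction is replayed. How the real tserver tracks that is not described in this message.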