[ https://issues.apache.org/jira/browse/ACCUMULO-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs updated ACCUMULO-575:
---------------------------------------

    Assignee: John Vines
    Reporter: John Vines  (was: jv)
    
> Potential data loss when datanode fails immediately after minor compaction
> --------------------------------------------------------------------------
>
>                 Key: ACCUMULO-575
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-575
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.4.1, 1.4.0
>            Reporter: John Vines
>            Assignee: John Vines
>             Fix For: 1.5.0
>
>
> This one occurred to me a few days ago, and I've done some research.
> Context:
> 1. The in-memory map is written to an RFile.
> 2. Eventually, FSOutputStream.close() is called.
> 3. close() calls complete(), which will not return until 
> dfs.replication.min replicas exist. dfs.replication.min defaults to 1, and 
> I don't think it is changed often.
> 4. We read the file to make sure that it was written correctly (this has 
> probably been a mitigating factor, and may be why we haven't run into this 
> potential issue).
> 5. We record the file in the !METADATA table.
> 6. We write the minor compaction to the walog.
> If the datanode goes down after step 6 but before the file has been 
> replicated further, then we'll have data loss. The namenode will know the 
> file is corrupted, but we can't restore it automatically, because the walog 
> records the minor compaction as complete. Step 4 has probably provided 
> enough of a time buffer to significantly decrease the chance of this 
> happening.
> I have not explicitly tested this, but I want to validate the scenario by 
> dropping a datanode in a multi-node system immediately after closing the 
> FSOutputStream and checking whether data is lost. If it is, then we may 
> want to consider adding a wait between steps 4 and 5 that polls the 
> namenode until replication reaches at least min(2, # nodes).
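
The wait proposed between steps 4 and 5 could look roughly like the sketch below. It is only an illustration under stated assumptions: `ReplicationWait`, `waitForReplication`, and the `IntSupplier` probe are hypothetical names, not Accumulo or Hadoop APIs, and the replica threshold is passed in by the caller. In a real tserver the probe would presumably ask the namenode, for example through Hadoop's `FileSystem.getFileBlockLocations` and the smallest `BlockLocation.getHosts().length` across the file's blocks; that wiring is an assumption and is left out so the sketch runs without a cluster.

```java
import java.util.function.IntSupplier;

// Hypothetical helper, not part of Accumulo or Hadoop.
final class ReplicationWait {

    private ReplicationWait() {}

    /**
     * Polls the probe until it reports at least `target` replicas or the
     * timeout elapses. Returns true if the target was reached, false on
     * timeout or if the waiting thread was interrupted.
     *
     * The probe is an assumption: it stands in for a namenode query that
     * returns the lowest replica count over all blocks of the new RFile.
     */
    static boolean waitForReplication(IntSupplier replicaProbe, int target,
                                      long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (replicaProbe.getAsInt() >= target) {
                return true; // enough replicas exist; safe to go on to step 5
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out; the caller decides how to react
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }
}
```

A caller would run this after the read-back check in step 4 and only record the file in the !METADATA table once it returns true. What to do on a timeout (retry, log and proceed, or fail the compaction) is a policy choice the description above does not settle.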

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
