keith-turner opened a new pull request #539: Fix WAL race condition between 
zookeeper and metadata table
URL: https://github.com/apache/accumulo/pull/539
 
 
   I noticed this race condition while looking into #535.  This seems like a 
very low probability event because multiple conditions would need to be met in 
a small time window for data loss to occur as a result of this race.
   
    * Only tablet A references WAL_X in its otherLogs set
    * Thread T1 one clears otherLogs for tablet A 
    * Thread T2 calls markUnusedWALs() removing WAL_X from zookeeper
    * Tablet server dies (tablet data that was in WAL_X is not in the metadata 
table or recovery data)
    * Thread T1 would have updated the metadata table with a new file if the 
tserver had not died
   
   This change acquires a lock so that the call to markUnusedWALs() will block 
until the file is added to the metadata table.  This lock prevents zookeeper 
updates from happening before metadata table updates.
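
   The ordering this lock enforces can be sketched roughly as follows. This is 
a simplified illustration, not the code in this pull request: the class 
WalLockSketch, the method clearLogsAndAddFile(), and the in-memory sets standing 
in for zookeeper and the metadata table are all hypothetical. Only 
markUnusedWALs() and otherLogs are names taken from the description above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: these types are stand-ins, not Accumulo classes.
class WalLockSketch {
  private final ReentrantLock logLock = new ReentrantLock();
  // WALs the tablet still references (the otherLogs set in the description).
  private final Set<String> otherLogs = new HashSet<>();
  // In-memory stand-in for the WAL entries kept in zookeeper.
  private final Set<String> zookeeperWals = new HashSet<>();
  // In-memory stand-in for the files recorded in the metadata table.
  private final List<String> metadataFiles = new ArrayList<>();
  // Ordered record of updates, so the resulting order can be inspected.
  final List<String> events = new ArrayList<>();

  WalLockSketch(String wal) {
    otherLogs.add(wal);
    zookeeperWals.add(wal);
  }

  // Thread T1: clearing otherLogs and adding the new file to the metadata
  // table happen while holding the same lock, so no other thread can observe
  // the cleared set before the metadata update is done.
  void clearLogsAndAddFile(String newFile) {
    logLock.lock();
    try {
      otherLogs.clear();
      metadataFiles.add(newFile);
      events.add("metadata:" + newFile);
    } finally {
      logLock.unlock();
    }
  }

  // Thread T2: blocks on the lock while T1 is between its two steps, so a WAL
  // is only removed from zookeeper after the metadata table has the new file.
  void markUnusedWALs() {
    logLock.lock();
    try {
      if (otherLogs.isEmpty()) {
        for (String wal : zookeeperWals) {
          events.add("zookeeper-remove:" + wal);
        }
        zookeeperWals.clear();
      }
    } finally {
      logLock.unlock();
    }
  }
}
```

   In this sketch, a call to markUnusedWALs() that arrives while T1 holds the 
lock waits until the metadata update is recorded, so the "zookeeper-remove" 
event can never precede the "metadata" event for the same tablet.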
   
   This bug only affects Accumulo 1.8.0 and later.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
