matthpeterson opened a new issue, #608:
URL: https://github.com/apache/accumulo/issues/608

   A significant number of imported files were garbage collected despite the 
presence of "file" entries referencing them.  Only the file references within 
one accumulo.metadata tablet, which served many tablets of a table, were 
affected.  
   
   The tablet server hosting the accumulo.metadata tablet died, the tablet was 
reassigned, and the WALs were recovered.  See the log summary below.
   
   16:04:50 - tserver imports one of the files that will later go missing
   17:28:59 - GC recognizes candidate file is in use
   17:41:25 - GC gc.SimpleGarbageCollector collect cycle statistics reported
   17:42:40 - GC gc.GarbageCollectWriteAheadLogs reports statistics
   17:44:49 - GC Thread gc stuck on IO to accumulo master for at least 2 minutes
   17:46:26 - GC Thread gc no longer stuck
   17:56:06 - datanode dies on tserver hosting accumulo.metadata
      ...
      Tserver reports many errors related to HDFS access, including WAL delete 
failures
      ...
   17:57:14 - datanode startup
   17:57:44 - tserver dies
   
   17:58:45 - master.Master reports tablet assigned to dead tserver
   17:58:49 - master.Master reports old location state
   17:59:20 - master.Master reports old location state
   17:59:20 - master.Master reports assigning tablet
   17:59:27 - master.Master reports new location state
   17:59:28 - master.Master reports new location state
   17:59:29 - master.Master reports new location state
   17:59:26 - tserver: Loading accumulo.metadata tablet
   17:59:31 - tserver: got assignment from master for tablet, loading and 
verifying extent
   17:59:32 - WAL recovery starts for accumulo.metadata tablet
   17:59:32 - WAL recovery complete for accumulo.metadata tablet
   17:59:34 - master.EventCoordinator reports accumulo.metadata tablet loaded
   
   18:06:09 - GC deletes file that was previously recognized as in use
   
   18:10:29 - gc.SimpleGarbageCollector collect cycle statistics reported
   18:12:00 - Errors begin for one of the missing files
   
   This impacted a 1.8.1 fork that contained the 1.9.1 data loss bug fixes.

