keith-turner opened a new issue, #5969:
URL: https://github.com/apache/accumulo/issues/5969

   **Describe the bug**
   
   When many tservers die, this [code in the manager](https://github.com/apache/accumulo/blob/c92c2246a7bb5d0c848175448e54e8573bc3a087/server/manager/src/main/java/org/apache/accumulo/manager/TabletGroupWatcher.java#L324)
does a lot of redundant work per tablet for write-ahead logs.  The same 
write-ahead log can be referenced by many tablets.
   
   These per-tablet computations related to walogs can significantly slow down 
the manager's scan of the metadata table.
   
   **To Reproduce**
   
   Create a table with lots of tablets and then kill the tservers hosting it.  
Observe where the manager spends its time during recovery.
   
   **Expected behavior**
   
   For each complete scan of the metadata table, the manager should remember 
what is needed for each write-ahead log.  The first time it sees a write-ahead 
log in a tablet, it should do the computations and remember the result for 
subsequent tablets.  This would avoid repeating the same computation for the 
many tablets that reference the same walog.
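
A minimal sketch of the idea, not the actual manager code. `PerScanWalCache`, `expensiveCheck`, and the use of a plain `String` to identify a walog are all hypothetical names chosen for illustration. One cache instance would be created at the start of each metadata scan and discarded at the end, so results never carry over between scans.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical per-scan memoization of walog computations.
 * Not an Accumulo class; create one instance per metadata scan.
 */
final class PerScanWalCache<R> {
  // walog identifier -> result computed the first time it was seen in this scan
  private final Map<String, R> results = new HashMap<>();
  // stand-in for whatever per-walog work the manager does today for every tablet
  private final Function<String, R> expensiveCheck;
  private int computations = 0;

  PerScanWalCache(Function<String, R> expensiveCheck) {
    this.expensiveCheck = expensiveCheck;
  }

  /** Returns the cached result for a walog, computing it only on first sight. */
  R get(String walog) {
    return results.computeIfAbsent(walog, w -> {
      computations++;
      return expensiveCheck.apply(w);
    });
  }

  /** Number of times the expensive computation actually ran. */
  int computations() {
    return computations;
  }
}
```

With this shape, a scan over N tablets that all reference the same walog would run the expensive computation once instead of N times. Scoping the cache to a single scan keeps the results from going stale across scans.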


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
