keith-turner opened a new issue, #5650:
URL: https://github.com/apache/accumulo/issues/5650

   **Describe the bug**
   
   Scan file references that were written by compactions are removed when the 
last scan finishes using them 
[here](https://github.com/apache/accumulo/blob/022225a4b2bed3b50cdcc0e0dd5f67d6aaa8a1f5/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/DatafileManager.java#L155-L160).
  If the scan thread is interrupted while doing the metadata write that removes 
these references (there have been changes in 2.1 to more aggressively interrupt 
scans that clients abandoned), the scan references could be left in the tablet 
until it is reloaded.  Scan refs are always removed on tablet load.  Scan refs 
left around until reload would prevent garbage collection of files that are no 
longer in use.
   
   **Expected behavior**
   
   Scan refs are eventually removed by a tablet that is no longer using them.  
   
   Could remove this 
[code](https://github.com/apache/accumulo/blob/022225a4b2bed3b50cdcc0e0dd5f67d6aaa8a1f5/server/tserver/src/main/java/org/apache/accumulo/tserver/tablet/DatafileManager.java#L155-L160)
 and instead have a periodic task in the tserver that looks for inactive scan 
refs and deletes them from the metadata table and if that is successful removes 
them from tserver memory.  This would ensure they always get cleaned up.  
Another nice thing about this change is it would avoid a potential metadata 
tablet write for a scan of a user tablet.
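
A rough sketch of what one pass of such a periodic task might look like. This is not Accumulo code: `ScanRefStore`, `ScanRefCleaner`, and all method names are hypothetical stand-ins, and files are plain strings. The point it illustrates is the ordering: delete from the metadata table first, and only drop the refs from memory if that write succeeded, so a failed or interrupted write is simply retried on the next pass.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the metadata table write; not an Accumulo API.
interface ScanRefStore {
  // Deletes the given scan refs from the metadata table, throwing on failure.
  void deleteScanRefs(Set<String> files) throws Exception;
}

// Hypothetical per-tablet bookkeeping for scan refs (illustration only).
class ScanRefCleaner {
  // file -> number of scans currently reading it
  private final Map<String,Integer> activeReaders = new HashMap<>();
  // files that currently have a scan ref in the metadata table
  private final Set<String> scanRefs = new HashSet<>();
  private final ScanRefStore store;

  ScanRefCleaner(ScanRefStore store) {
    this.store = store;
  }

  synchronized void addScanRef(String file) {
    scanRefs.add(file);
  }

  synchronized void scanStarted(String file) {
    activeReaders.merge(file, 1, Integer::sum);
  }

  // Only updates memory; a finishing scan never writes to the metadata table.
  synchronized void scanFinished(String file) {
    activeReaders.computeIfPresent(file, (f, n) -> n > 1 ? n - 1 : null);
  }

  synchronized int scanRefCount() {
    return scanRefs.size();
  }

  // One pass of the periodic task. Returns the number of scan refs removed.
  // For simplicity the lock is held across the metadata write; a real
  // implementation would likely avoid blocking scans this way.
  synchronized int cleanOnce() {
    Set<String> inactive = new HashSet<>();
    for (String file : scanRefs) {
      if (!activeReaders.containsKey(file)) {
        inactive.add(file);
      }
    }
    if (inactive.isEmpty()) {
      return 0;
    }
    try {
      store.deleteScanRefs(inactive);
    } catch (Exception e) {
      if (e instanceof InterruptedException) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
      }
      return 0; // refs stay in memory and are retried on the next pass
    }
    // Metadata write succeeded, so it is now safe to forget the refs.
    scanRefs.removeAll(inactive);
    return inactive.size();
  }
}
```

Something like `ScheduledExecutorService.scheduleWithFixedDelay` could call `cleanOnce()` for each online tablet. Because the cleanup runs on its own thread, interrupting an abandoned scan can no longer interrupt the metadata write.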
   

