GitHub user keith-turner commented on the issue:

    https://github.com/apache/accumulo/pull/224
  
    > @keith-turner we talked about this yesterday, but I wanted to post it 
here. What would happen if a file is deleted, like maybe compacted and gc'd, 
after the file list is grabbed?
    
    @mjwall I had not thought of this case and currently have no handling for 
it.  Yet another win for code reviews.
    
    I think the best solution to this problem is to introduce a new inaccuracy 
counter called `deleted`.  There are already a few inaccuracy counters reported 
when gathering summary information.  I will add another comment that shows where 
these can be found.
    
    At first I thought I could circle back and use the file that replaced a 
missing file.  However, this approach has a problem.  Multiple deleted files 
could have been compacted into the replacement file, and for some of those 
deleted files we may have already gathered and merged summary information.  
Trying to avoid this problem would make gathering summaries more expensive.  In 
order to keep gathering summaries fast, I think it would be best to just report 
the problem.  If someone really wants to avoid this problem, they can clone the 
table and make the request against the clone.  I can put this avoidance 
strategy in the javadoc for `deleted`.
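
    As an illustration only, the "just report the problem" approach could look 
roughly like the sketch below.  Every name here (`DeletedCounterSketch`, 
`Result`, `gather`, and the in-memory `store` standing in for the file system) 
is hypothetical and is not taken from the Accumulo code base; only the counter 
name `deleted` comes from the proposal above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Accumulo code: merge per-file summaries and count
// files that disappeared after the file list was grabbed.
class DeletedCounterSketch {

  /** Hypothetical result type: merged counts plus one inaccuracy counter. */
  static final class Result {
    final Map<String, Long> merged = new HashMap<>();
    long deleted = 0; // files missing by the time they were read
  }

  /**
   * @param fileList snapshot of file names taken up front
   * @param store stand-in for the file system: file name to that file's summary counts
   */
  static Result gather(List<String> fileList, Map<String, Map<String, Long>> store) {
    Result result = new Result();
    for (String file : fileList) {
      Map<String, Long> summary = store.get(file);
      if (summary == null) {
        // The file was compacted away and gc'd. Its replacement may also contain
        // data from files that were already merged, so it is not consulted;
        // the miss is only counted.
        result.deleted++;
        continue;
      }
      // Sum each counter from this file into the running totals.
      summary.forEach((key, count) -> result.merged.merge(key, count, Long::sum));
    }
    return result;
  }
}
```

    A caller would check whether `deleted` is greater than zero to learn that 
the merged counts may be incomplete, and could then retry or run the request 
against a clone of the table.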

