cshannon commented on issue #608:
URL: https://github.com/apache/accumulo/issues/608#issuecomment-1399316343

   I investigated this again yesterday, and there are still some reports of 
the same behavior against 1.10.x. As before, data is missing from the scan of 
the accumulo.metadata table during GC, so references are missed. In this case 
an entire tablet's worth of data is missing, which leads to files being 
deleted that shouldn't be, because their file references are never seen.
   
   A simple example of the scenario seen is:
   
   1. accumulo.root has 1 tablet containing the metadata for 4 
accumulo.metadata tablets.
   2. The 3rd accumulo.metadata tablet contains metadata for the 10th, 11th, 
and 12th tablets in table X.
   3. GC runs, and all of the metadata contained in the 1st, 2nd, and 4th 
accumulo.metadata tablets is returned and used appropriately by the garbage 
collector.
   4. However, during the GC run all of the data for the 3rd accumulo.metadata 
tablet goes missing, so the references to files in the 10th, 11th, and 12th 
tablets of table X are missed.
   5. Because the file references are missing, files for the 10th, 11th, and 
12th tablets of table X get removed.
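
   The scenario above comes down to a set difference: delete candidates minus 
the references seen during the scan. The following is a minimal sketch of that 
simplified model only. The class name, method name, and file names are 
hypothetical and this is not Accumulo's GC code.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class GcScenarioSketch {

  // Simplified model: a candidate file is deleted unless some scanned
  // metadata tablet still references it.
  public static Set<String> deletable(Set<String> candidates,
      List<Set<String>> refsPerMetadataTablet) {
    Set<String> result = new TreeSet<>(candidates);
    for (Set<String> refs : refsPerMetadataTablet) {
      result.removeAll(refs);
    }
    return result;
  }

  public static void main(String[] args) {
    // Hypothetical file names; x10/x11/x12 stand for files of table X.
    Set<String> candidates = Set.of("x10.rf", "x11.rf", "x12.rf", "old.rf");
    List<Set<String>> fullScan = List.of(
        Set.of("a.rf"),                       // 1st metadata tablet
        Set.of("b.rf"),                       // 2nd metadata tablet
        Set.of("x10.rf", "x11.rf", "x12.rf"), // 3rd metadata tablet
        Set.of("c.rf"));                      // 4th metadata tablet

    // Same scan with the 3rd metadata tablet's data missing.
    List<Set<String>> partialScan =
        List.of(fullScan.get(0), fullScan.get(1), fullScan.get(3));

    System.out.println("full scan deletes:    " + deletable(candidates, fullScan));
    System.out.println("partial scan deletes: " + deletable(candidates, partialScan));
  }
}
```

   With the 3rd tablet's references absent, the still-referenced files of 
table X fall into the deletable set, which is the data loss described above.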
   
   In some instances the issue could be traced to a hardware failure (disk 
issue), but this has not always been the case. Even with a hardware failure, 
the missing data should be detected by the safety checks performed while 
scanning the metadata table for file references. If errors or inconsistencies 
are detected during the scan (due to splits, etc.), the iterator that is used 
should reset itself to prevent an issue.
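
   One way such a safety check can work is to verify that consecutive tablets 
link up by row boundaries, so a missing tablet shows up as a gap. The sketch 
below illustrates that idea under stated assumptions. `TabletChainCheck`, 
`TabletRange`, and `isComplete` are hypothetical names, and this is not the 
code of the iterators discussed in this comment.

```java
import java.util.List;

public class TabletChainCheck {

  // Hypothetical, simplified tablet covering rows in (prevEndRow, endRow].
  // A null prevEndRow marks the first tablet of a table; a null endRow
  // marks the last one.
  public static final class TabletRange {
    final String prevEndRow;
    final String endRow;

    public TabletRange(String prevEndRow, String endRow) {
      this.prevEndRow = prevEndRow;
      this.endRow = endRow;
    }
  }

  // True only if the scanned tablets form one unbroken chain from the first
  // tablet to the last, each starting exactly where the previous one ended.
  public static boolean isComplete(List<TabletRange> scanned) {
    if (scanned.isEmpty() || scanned.get(0).prevEndRow != null) {
      return false;
    }
    for (int i = 1; i < scanned.size(); i++) {
      String prevEnd = scanned.get(i - 1).endRow;
      if (prevEnd == null || !prevEnd.equals(scanned.get(i).prevEndRow)) {
        return false; // gap, overlap, or data after the supposed last tablet
      }
    }
    return scanned.get(scanned.size() - 1).endRow == null;
  }

  public static void main(String[] args) {
    List<TabletRange> full = List.of(
        new TabletRange(null, "g"), new TabletRange("g", "m"),
        new TabletRange("m", "t"), new TabletRange("t", null));
    // Same scan with the 3rd tablet missing.
    List<TabletRange> missingThird =
        List.of(full.get(0), full.get(1), full.get(3));

    System.out.println("full chain complete:    " + isComplete(full));
    System.out.println("missing tablet detected: " + !isComplete(missingThird));
  }
}
```

   A check of this shape only catches a missing tablet if the neighbouring 
tablets are actually returned, so it says nothing about failures that drop 
data in other ways.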
   
   However, because file references end up missing and the checks are 
bypassed, this may indicate a problem in the iterator's error 
handling/detection: GC continued without the missing data being detected. The 
iterator used for scanning for file references in version 1.10 is the 
[TabletIterator](https://github.com/apache/accumulo/blob/88b75633f7eef883f07446cfbff01cc193ecf6fa/server/base/src/main/java/org/apache/accumulo/server/util/TabletIterator.java#L57).
It's not known whether the issue still exists in 2.1, where changes were made 
and a different iterator is now used, but if it does it would likely be in the 
equivalent iterator used for the references scan, the 
[LinkingIterator](https://github.com/apache/accumulo/blob/bd49cff1be7de00a43cd9f03e75c319e5ab3735d/core/src/main/java/org/apache/accumulo/core/metadata/schema/LinkingIterator.java#L50).
   
   The root cause of the issue isn't known, which makes it hard to fix or find 
a solution for now, but I talked to @dlmarion about this and thought of a few 
things to bring up and discuss.
   
   1. Does it make sense to just abort the GC run instead of trying to reset 
the iterator if an error is detected? The GC process runs in a loop anyway and 
will just re-run on the next iteration. It's possible there's an issue with 
resetting the iterator, so just aborting might be better.
   2. Is it possible there is actually a server side error that isn't bubbling 
back to the client which would lead to the client continuing the scan when it 
shouldn't?
   3. Should we try to run an extra sanity check to verify the data is 
actually correct? For example, should we also scan with the Offline iterator 
and compare? Can we verify we actually received all the tablets we should 
by checking before and after the run? Do we run the scan multiple times and 
compare? If we ran multiple times we could take a superset of files seen 
(anything that truly should be GC'd would just get removed next run).
   4. Because an entire tablet of data was missing in this scenario, is it 
possible it's related to tablet splitting and the data was missed?
   5. Does this error actually happen in other cases (scans of other/normal 
tables) where missing data usually goes unnoticed, whereas here it is 
noticed because it leads to errors later? Or is this only related to the 
metadata table and these specific iterators?
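
   The superset idea from item 3 can be sketched as follows: run the 
reference scan several times, union the results, and only delete candidates 
that no scan saw. This is a rough illustration, assuming the scan can be 
modelled as a function returning a set of file names. `MultiScanSketch`, 
`unionOfScans`, and `safeToDelete` are hypothetical names, not existing 
Accumulo APIs.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

public class MultiScanSketch {

  // Runs the (hypothetical) reference scan `runs` times and returns the
  // union of every file reference seen in any run.
  public static Set<String> unionOfScans(Supplier<Set<String>> scan, int runs) {
    Set<String> union = new HashSet<>();
    for (int i = 0; i < runs; i++) {
      union.addAll(scan.get());
    }
    return union;
  }

  // Candidates not referenced in any scan. A file wrongly kept here is
  // simply picked up by a later GC run.
  public static Set<String> safeToDelete(Set<String> candidates,
      Set<String> referenced) {
    Set<String> result = new HashSet<>(candidates);
    result.removeAll(referenced);
    return result;
  }

  public static void main(String[] args) {
    // First scan misses the reference to x10.rf, second scan is complete.
    Iterator<Set<String>> scans = List.of(
        Set.of("a.rf"),
        Set.of("a.rf", "x10.rf")).iterator();

    Set<String> referenced = unionOfScans(scans::next, 2);
    Set<String> candidates = Set.of("x10.rf", "old.rf");

    System.out.println("deleting: " + safeToDelete(candidates, referenced));
  }
}
```

   The union only helps when at least one of the scans returns the missing 
references. If every run drops the same tablet, the result is no safer than a 
single scan.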
   
   Obviously more investigation needs to be done, but this is where things 
currently stand after I started looking into it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
