ctubbsii commented on code in PR #2792:
URL: https://github.com/apache/accumulo/pull/2792#discussion_r977751451


##########
server/gc/src/main/java/org/apache/accumulo/gc/GarbageCollectionAlgorithm.java:
##########
@@ -216,6 +224,46 @@ private long 
removeBlipCandidates(GarbageCollectionEnvironment gce,
     return blipCount;
   }
 
+  @VisibleForTesting
+  /**
+   * Double check no tables were missed during GC
+   */
+  protected void ensureAllTablesChecked(Set<TableId> tableIdsBefore, 
Set<TableId> tableIdsSeen,
+      Set<TableId> tableIdsAfter) {
+
+    // if a table was added or deleted during this run, it is acceptable to not
+    // have seen those tables ids when scanning the metadata table. So get the 
intersection
+    final Set<TableId> tableIdsMustHaveSeen = new HashSet<>(tableIdsBefore);
+    tableIdsMustHaveSeen.retainAll(tableIdsAfter);
+
+    if (tableIdsMustHaveSeen.isEmpty() && !tableIdsSeen.isEmpty()) {
+      throw new RuntimeException("Garbage collection will not proceed because "
+          + "table ids were seen in the metadata table and none were seen 
Zookeeper. "
+          + "This can have two causes. First, total number of tables going 
to/from "
+          + "zero during a GC cycle will cause this. Second, it could be 
caused by "
+          + "corruption of the metadata table and/or Zookeeper. Only the 
second cause "
+          + "is problematic, but there is no way to distinguish between the 
two causes "
+          + "so this GC cycle will not proceed. The first cause should be 
transient "
+          + "and one would not expect to see this message repeated in 
subsequent GC cycles.");
+    }

Review Comment:
   While creating the first table might not trigger this because of the lack of 
gc candidate files, it's still true that users are likely creating tables soon 
after initialization, and my point is that going from zero (user) tables to 
non-zero is more likely to occur soon after initialization than at other times, 
even if that initial activity would need to be more involved in order to 
trigger this (such as creating a test table, flushing, deleting, then creating 
the first table that the first GC run sees).
   
   However, if this doesn't crash the GC, but merely triggers the current cycle 
to be skipped, then my concern goes away.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to