Copilot commented on code in PR #10363:
URL: https://github.com/apache/ozone/pull/10363#discussion_r3437001682


##########
hadoop-hdds/rocksdb-checkpoint-differ/src/test/java/org/apache/ozone/rocksdiff/TestRocksDBCheckpointDiffer.java:
##########
@@ -1009,33 +1017,60 @@ private static List<ColumnFamilyDescriptor> 
getColumnFamilyDescriptors() {
         .map(ColumnFamilyDescriptor::new).collect(Collectors.toList());
   }
 
+  /**
+   * Resolves column family for an SST id: compaction DAG first, then 
snapshots.
+   *
+   * @return (true, cf) when the SST is known for this run (cf may be null);
+   *     (false, ignored) when the id does not appear (e.g. different SST 
numbering than the hard-coded list).
+   */
+  private Pair<Boolean, String> 
resolveColumnFamilyForDiffFile(RocksDBCheckpointDiffer differ, String diffFile,
+      DifferSnapshotInfo srcSnap, DifferSnapshotInfo destSnap) {
+    CompactionNode node = differ.getCompactionNodeMap().get(diffFile);
+    if (node != null) {
+      return Pair.of(true, node.getColumnFamily());
+    }

Review Comment:
   `resolveColumnFamilyForDiffFile` returns the compaction DAG node's column 
family even when it's null. If the DAG node is missing column-family metadata 
(possible per RocksDiffUtils.shouldSkipNode()), this makes the expected diff 
treat the file as unfiltered (included for every table mask), while the actual 
diff is still filtered by snapshot metadata column family via 
DifferSnapshotVersion, leading to mismatched expectations/flakiness. Prefer 
falling back to snapshot metadata when the DAG node column family is null.



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/test/java/org/apache/ozone/rocksdiff/TestRocksDBCheckpointDiffer.java:
##########
@@ -1064,11 +1099,12 @@ void diffAllSnapshots(RocksDBCheckpointDiffer differ)
         LOG.info("SST diff list from '{}' to '{}': {} tables: {}",
             src.getDbPath(0), snap.getDbPath(0), sstDiffList, tableToLookUp);
 
-        assertEquals(expectedDiffFiles, 
sstDiffList.stream().map(SstFileInfo::getFileName)
-            .collect(Collectors.toList()));
+        List<String> actualNames =
+            
sstDiffList.stream().map(SstFileInfo::getFileName).collect(Collectors.toList());

Review Comment:
   `getSSTDiffList(...)` can return `Optional.empty()` when the compaction DAG 
traversal cannot reach all destination SSTs. The current code collapses that 
into an empty list via `orElse(Collections.emptyList())`, which can incorrectly 
make the subsequent assertions pass (especially when `expectedDiffFiles` is 
empty for that mask). Consider failing fast when the Optional is empty so the 
test still validates DAG completeness.



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/test/java/org/apache/ozone/rocksdiff/TestRocksDBCheckpointDiffer.java:
##########
@@ -1009,33 +1017,60 @@ private static List<ColumnFamilyDescriptor> 
getColumnFamilyDescriptors() {
         .map(ColumnFamilyDescriptor::new).collect(Collectors.toList());
   }
 
+  /**
+   * Resolves column family for an SST id: compaction DAG first, then 
snapshots.
+   *
+   * @return (true, cf) when the SST is known for this run (cf may be null);
+   *     (false, ignored) when the id does not appear (e.g. different SST 
numbering than the hard-coded list).
+   */

Review Comment:
   The `@return` Javadoc still references a “hard-coded list”, but this test 
now derives the baseline diff list dynamically. Updating the wording will keep 
the comment accurate and avoid confusion for future maintainers.



##########
hadoop-hdds/rocksdb-checkpoint-differ/src/test/java/org/apache/ozone/rocksdiff/TestRocksDBCheckpointDiffer.java:
##########
@@ -1009,33 +1017,60 @@ private static List<ColumnFamilyDescriptor> 
getColumnFamilyDescriptors() {
         .map(ColumnFamilyDescriptor::new).collect(Collectors.toList());
   }
 
+  /**
+   * Resolves column family for an SST id: compaction DAG first, then 
snapshots.
+   *
+   * @return (true, cf) when the SST is known for this run (cf may be null);
+   *     (false, ignored) when the id does not appear (e.g. different SST 
numbering than the hard-coded list).
+   */
+  private Pair<Boolean, String> 
resolveColumnFamilyForDiffFile(RocksDBCheckpointDiffer differ, String diffFile,
+      DifferSnapshotInfo srcSnap, DifferSnapshotInfo destSnap) {
+    CompactionNode node = differ.getCompactionNodeMap().get(diffFile);
+    if (node != null) {
+      return Pair.of(true, node.getColumnFamily());
+    }
+    SstFileInfo meta = srcSnap.getSstFile(0, diffFile);
+    if (meta == null) {
+      meta = destSnap.getSstFile(0, diffFile);
+    }
+    if (meta == null) {
+      for (DifferSnapshotInfo s : snapshots) {
+        meta = s.getSstFile(0, diffFile);
+        if (meta != null) {
+          break;
+        }
+      }
+    }
+    if (meta == null) {
+      return Pair.of(false, null);
+    }
+    return Pair.of(true, meta.getColumnFamily());
+  }
+
   /**
    * Test SST differ.
    */
   void diffAllSnapshots(RocksDBCheckpointDiffer differ)
       throws IOException {
     final DifferSnapshotInfo src = snapshots.get(snapshots.size() - 1);
 
-    // Hard-coded expected output.
-    // The results are deterministic. Retrieved from a successful run.
-    final List<List<String>> expectedDifferResult = asList(
-        asList("000023", "000029", "000026", "000019", "000021", "000031"),
-        asList("000023", "000029", "000026", "000021", "000031"),
-        asList("000023", "000029", "000026", "000031"),
-        asList("000029", "000026", "000031"),
-        asList("000029", "000031"),
-        Collections.singletonList("000031"),
-        Collections.emptyList()
-    );
-    assertEquals(snapshots.size(), expectedDifferResult.size());
-
-    int index = 0;
     List<String> expectedDiffFiles = new ArrayList<>();
     for (DifferSnapshotInfo snap : snapshots) {
       // Returns a list of SST files to be fed into RocksCheckpointDiffer Dag.
       List<String> tablesToTrack = new 
ArrayList<>(COLUMN_FAMILIES_TO_TRACK_IN_DAG);
       // Add some invalid index.
       tablesToTrack.add("compactionLogTable");
+      Set<String> fullTableToLookUp = new HashSet<>(tablesToTrack);
+      List<String> baselineDiffFileNames =
+          differ.getSSTDiffList(
+                  new DifferSnapshotVersion(src, 0, fullTableToLookUp),
+                  new DifferSnapshotVersion(snap, 0, fullTableToLookUp),
+                  null, fullTableToLookUp, true)
+              .orElse(Collections.emptyList())
+              .stream()
+              .map(SstFileInfo::getFileName)
+              .collect(Collectors.toList());

Review Comment:
   `baselineDiffFileNames` currently treats 
`getSSTDiffList(...)=Optional.empty()` the same as an empty diff (via 
`orElse(Collections.emptyList())`). `Optional.empty()` is a distinct signal 
meaning the compaction DAG traversal could not reach all destination SSTs, and 
collapsing it to empty can hide regressions by making both expected and actual 
empty. The test should fail fast when the DAG diff is unavailable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to