dlmarion commented on code in PR #2767: URL: https://github.com/apache/accumulo/pull/2767#discussion_r894713509
########## core/src/main/java/org/apache/accumulo/core/gc/Reference.java: ########## @@ -19,50 +19,27 @@ package org.apache.accumulo.core.gc; import org.apache.accumulo.core.data.TableId; -import org.apache.accumulo.core.metadata.schema.MetadataSchema; /** - * A GC reference to a tablet file or directory. + * A GC reference used for collecting files and directories into a single stream. */ -public class Reference implements Comparable<Reference> { - // parts of an absolute URI, like "hdfs://1.2.3.4/accumulo/tables/2a/t-0003" - public final TableId tableId; // 2a +public interface Reference { + /** + * Only return true if the reference is a directory. + */ + boolean isDirectory(); - // the exact string that is stored in the metadata - public final String metadataEntry; + /** + * Get the {@link TableId} of the reference. + */ + TableId getTableId(); - public Reference(TableId tableId, String metadataEntry) { - MetadataSchema.TabletsSection.ServerColumnFamily.validateDirCol(tableId.canonical()); - this.tableId = tableId; - this.metadataEntry = metadataEntry; - } - - @Override - public int compareTo(Reference that) { - if (equals(that)) { - return 0; - } else { - return this.metadataEntry.compareTo(that.metadataEntry); - } - } - - @Override - public boolean equals(Object obj) { - if (this == obj) - return true; - if (obj == null) - return false; - if (getClass() != obj.getClass()) - return false; - Reference other = (Reference) obj; - if (metadataEntry == null) { - return other.metadataEntry == null; - } else - return metadataEntry.equals(other.metadataEntry); - } - - @Override - public int hashCode() { - return this.metadataEntry.hashCode(); - } + /** + * Get the exact string stored in the metadata table for this file or directory. A file will be + * read from the Tablet "file" column family: Review Comment: I'm not sure that this is true. Certainly file references will be created using the `file` colf int he metadata table, but also the `scan` entries. The ScanServer feature adds a new section of references to the metadata table for scan server related file references. ########## core/src/main/java/org/apache/accumulo/core/gc/ReferenceDirectory.java: ########## @@ -21,13 +21,22 @@ import org.apache.accumulo.core.data.TableId; /** - * Part of the Tablet File path that is definitely a directory. + * A GC reference to a Tablet directory, like t-0003. */ -public class ReferenceDirectory extends Reference { - public final String tabletDir; // t-0003 +public class ReferenceDirectory extends ReferenceFile { + private final String tabletDir; // t-0003 public ReferenceDirectory(TableId tableId, String dirName) { super(tableId, dirName); Review Comment: In this case, the `metadataEntry` is only the dirname, like `t-0003` ? Or, is the example wrong? ########## server/gc/src/main/java/org/apache/accumulo/gc/GCRun.java: ########## @@ -139,21 +140,29 @@ public Stream<String> getBlipPaths() throws TableNotFoundException { public Stream<Reference> getReferences() { Stream<TabletMetadata> tabletStream; + // create a stream of metadata entries read from file, scan and tablet dir columns if (level == Ample.DataLevel.ROOT) { tabletStream = Stream.of(context.getAmple().readTablet(RootTable.EXTENT, DIR, FILES, SCANS)); } else { - tabletStream = TabletsMetadata.builder(context).scanTable(level.metaTable()) - .checkConsistency().fetch(DIR, FILES, SCANS).build().stream(); + try (var tabletsMetadata = TabletsMetadata.builder(context).scanTable(level.metaTable()) + .checkConsistency().fetch(DIR, FILES, SCANS).build()) { + tabletStream = tabletsMetadata.stream(); + } } + // there is a lot going on in this "one line" so see below for more info return tabletStream.flatMap(tm -> { - Stream<Reference> refs = Stream.concat(tm.getFiles().stream(), tm.getScans().stream()) - .map(f -> new Reference(tm.getTableId(), f.getMetaUpdateDelete())); + // combine all the entries read from file and scan columns in the metadata table + var fileStream = Stream.concat(tm.getFiles().stream(), tm.getScans().stream()); + // map the files to Reference objects + var stream = fileStream.map(f -> new ReferenceFile(tm.getTableId(), f.getMetaUpdateDelete())); + // if dirName is populated then we have a tablet directory aka srv:dir if (tm.getDirName() != null) { - refs = Stream.concat(refs, - Stream.of(new ReferenceDirectory(tm.getTableId(), tm.getDirName()))); + // add the tablet directory to the stream + var tabletDir = new ReferenceDirectory(tm.getTableId(), tm.getDirName()); Review Comment: Is `tm.getDirName()` a fully qualified path, or just the name of the directory? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: notifications-unsubscr...@accumulo.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org