thomasmueller commented on code in PR #771:
URL: https://github.com/apache/jackrabbit-oak/pull/771#discussion_r1035637354


##########
oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/MarkSweepGarbageCollector.java:
##########
@@ -405,13 +405,35 @@ protected void mark(GarbageCollectorFileState fs) throws 
IOException, DataStoreE
 
         // Mark all used references
         iterateNodeTree(fs, false);
-
+        
+        // Get size
+        sizeBlobStoreReferences(fs, stats);
+        
         // Move the marked references file to the data store meta area if 
applicable
         GarbageCollectionType.get(blobStore).addMarked(blobStore, fs, repoId, 
uniqueSuffix);
 
         LOG.debug("Ending mark phase of the garbage collector");
     }
 
+    private static void sizeBlobStoreReferences(GarbageCollectorFileState fs, 
GarbageCollectionOperationStats stats)
+        throws IOException {
+        try (LineIterator lineIterator = new LineIterator(new 
FileReader(fs.getMarkedRefs()))) {
+            lineIterator.forEachRemaining(line -> {
+                String id = line.split(DELIM)[0];

Review Comment:
   I'm not sure about exception handling. If the file is corrupt (eg. empty 
line, corruption length,...), what would happen? It looks like it would throw 
an exception. Instead, a warning should be logged, and then it should continue 
(I assume).



##########
oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/MarkSweepGarbageCollector.java:
##########
@@ -689,6 +711,9 @@ public long checkConsistency() throws Exception {
 
             // Mark all used blob references
             iterateNodeTree(fs, true);
+            // Move the marked references file to the data store meta area if 
applicable
+            String uniqueSuffix = UUID.randomUUID().toString();
+            GarbageCollectionType.get(blobStore).addMarked(blobStore, fs, 
repoId, uniqueSuffix);

Review Comment:
   I'm not sure... it doesn't seem to be related to reporting the size? Not 
saying it is wrong... Just that I would expect it in a different PR.



##########
oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/MarkSweepGarbageCollector.java:
##########
@@ -703,6 +728,21 @@ public long checkConsistency() throws Exception {
                 // Retrieve all other marked present in the datastore
                 List<DataRecord> refFiles =
                     ((SharedDataStore) 
blobStore).getAllMetadataRecords(SharedStoreRecordType.REFERENCES.getType());
+
+                // Get all the repositories registered
+                List<DataRecord> repoFiles =
+                    ((SharedDataStore) 
blobStore).getAllMetadataRecords(SharedStoreRecordType.REPOSITORY.getType());
+                LOG.info("Repositories registered {}", repoFiles);
+
+                // Retrieve repos for which reference files have not been 
created
+                Set<String> unAvailRepos =
+                    SharedDataStoreUtils.refsNotAvailableFromRepos(repoFiles, 
refFiles);
+                LOG.info("Repositories with unavailable references {}", 
unAvailRepos);
+                
+                if (!unAvailRepos.isEmpty()) {
+                    throw new NotAllRepositoryMarkedException("Not all 
repositories have marked references available");

Review Comment:
   Same as above: it doesn't seem to be related to reporting the size? 



##########
oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/MarkSweepGarbageCollector.java:
##########
@@ -1156,6 +1202,10 @@ class GarbageCollectionOperationStats implements 
OperationsStatsMBean {
         static final String NUM_BLOBS_DELETED = "NUM_BLOBS_DELETED";
         static final String TOTAL_SIZE_DELETED = "TOTAL_SIZE_DELETED";
         static final String NUM_CANDIDATES = "NUM_CANDIDATES";
+        
+        static final String NUM_BLOB_REFERENCES = "NUM_BLOB_REFERENCES";

Review Comment:
   I would rename the fields...



##########
oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/MarkSweepGarbageCollector.java:
##########
@@ -405,13 +405,35 @@ protected void mark(GarbageCollectorFileState fs) throws 
IOException, DataStoreE
 
         // Mark all used references
         iterateNodeTree(fs, false);
-
+        
+        // Get size
+        sizeBlobStoreReferences(fs, stats);
+        
         // Move the marked references file to the data store meta area if 
applicable
         GarbageCollectionType.get(blobStore).addMarked(blobStore, fs, repoId, 
uniqueSuffix);
 
         LOG.debug("Ending mark phase of the garbage collector");
     }
 
+    private static void sizeBlobStoreReferences(GarbageCollectorFileState fs, 
GarbageCollectionOperationStats stats)
+        throws IOException {
+        try (LineIterator lineIterator = new LineIterator(new 
FileReader(fs.getMarkedRefs()))) {
+            lineIterator.forEachRemaining(line -> {
+                String id = line.split(DELIM)[0];
+                long length = DataStoreBlobStore.BlobId.of(id).getLength();
+                LOG.info("Blob {} has size {}", id, length);
+
+                stats.getCollector().updateNumBlobReferences(1);
+
+                if (length != -1) {
+                    stats.getCollector().updateSizeBlobReferences(length);

Review Comment:
   what about "addTotalBlobSize"?



##########
oak-blob-plugins/src/main/java/org/apache/jackrabbit/oak/plugins/blob/MarkSweepGarbageCollector.java:
##########
@@ -405,13 +405,35 @@ protected void mark(GarbageCollectorFileState fs) throws 
IOException, DataStoreE
 
         // Mark all used references
         iterateNodeTree(fs, false);
-
+        
+        // Get size
+        sizeBlobStoreReferences(fs, stats);
+        
         // Move the marked references file to the data store meta area if 
applicable
         GarbageCollectionType.get(blobStore).addMarked(blobStore, fs, repoId, 
uniqueSuffix);
 
         LOG.debug("Ending mark phase of the garbage collector");
     }
 
+    private static void sizeBlobStoreReferences(GarbageCollectorFileState fs, 
GarbageCollectionOperationStats stats)
+        throws IOException {
+        try (LineIterator lineIterator = new LineIterator(new 
FileReader(fs.getMarkedRefs()))) {
+            lineIterator.forEachRemaining(line -> {
+                String id = line.split(DELIM)[0];
+                long length = DataStoreBlobStore.BlobId.of(id).getLength();
+                LOG.info("Blob {} has size {}", id, length);
+
+                stats.getCollector().updateNumBlobReferences(1);

Review Comment:
   what about "addTotalBlobCount" instead of "updateNumBlobReferences"?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to