Wim Symons created OAK-8170:
-------------------------------

             Summary: oak-run datastorecheck and online consistency check falsely report missing blobs
                 Key: OAK-8170
                 URL: https://issues.apache.org/jira/browse/OAK-8170
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: segment-tar
    Affects Versions: 1.8.9
            Reporter: Wim Symons
         Attachments: output.txt

Hi,

We found that oak-run datastorecheck falsely reports missing blobs when it is 
run without the --verbose option.

Even the online datastore consistency check falsely reports the same missing 
blobs.

This happens because the standard blob reference collector in oak-run 
datastorecheck looks at *all* compaction generations in the segment store 
instead of only the last one.

After running an offline compaction, and thus keeping only 1 generation, the 
correct number of blob references and missing blobs is reported by oak-run 
datastorecheck.
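
To illustrate the effect, here is a minimal, self-contained sketch. It does 
not use Oak's API: the class BlobReferenceSketch, the Segment type and the 
blob IDs are all hypothetical, and it assumes that a blob referenced only by 
an older generation has already been removed from the data store. Under that 
assumption, collecting references from every generation reports a missing 
blob, while collecting from the latest generation only does not.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, NOT Oak's API: a segment carries a compaction
// generation and the IDs of the blobs it references.
final class BlobReferenceSketch {

    static final class Segment {
        final int generation;
        final List<String> blobIds;

        Segment(int generation, List<String> blobIds) {
            this.generation = generation;
            this.blobIds = blobIds;
        }
    }

    // Collect blob IDs from all segments whose generation is >= minGeneration.
    static Set<String> collectReferences(List<Segment> segments, int minGeneration) {
        Set<String> refs = new LinkedHashSet<>();
        for (Segment s : segments) {
            if (s.generation >= minGeneration) {
                refs.addAll(s.blobIds);
            }
        }
        return refs;
    }

    // References for which the data store holds no blob.
    static Set<String> missing(Set<String> references, Set<String> dataStore) {
        Set<String> result = new LinkedHashSet<>(references);
        result.removeAll(dataStore);
        return result;
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(
                new Segment(1, List.of("blob-a", "blob-b")),   // older generation
                new Segment(2, List.of("blob-b", "blob-c")));  // latest generation

        // Assumption: blob-a is no longer referenced by the latest generation
        // and has already been removed from the data store.
        Set<String> dataStore = Set.of("blob-b", "blob-c");

        // All generations: the stale reference to blob-a shows up as missing.
        System.out.println("all generations:   "
                + missing(collectReferences(segments, 1), dataStore));

        // Latest generation only: nothing is missing.
        System.out.println("latest generation: "
                + missing(collectReferences(segments, 2), dataStore));
    }
}
```

In this toy model the first call reports blob-a as missing and the second 
reports nothing, which matches what we see before and after offline 
compaction.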

On the 1.8 branch, the behaviour starts in 
org.apache.jackrabbit.oak.plugins.blob.BlobReferenceRetriever#collectReferences 
(line 429). Following the call chain from there, you arrive at 
org.apache.jackrabbit.oak.segment.file.FileStore#tarFiles (line 1013), which 
contains:

tarFiles.collectBlobReferences(collector,
        newOldReclaimer(lastCompactionType, getGcGeneration(),
                gcOptions.getRetainedGenerations()));

I'm not familiar enough with this source code, so I won't attempt adding a 
patch.

I did double-check trunk and saw the same line of code there: 
org.apache.jackrabbit.oak.segment.file.GarbageCollector#collectBlobReferences 
(line 324).

I attached a text file with the outputs of the commands I ran.

We currently run Oak 1.8.9 with AEM 6.4.3.0, and oak-blob-cloud 1.8.9 from the 
1.8.3 AEM S3 connector.

Regards

Wim



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
