Julian Sedding created OAK-4473:
-----------------------------------

             Summary: MarkSweepGarbageCollector#saveBatchToFile should escape IDs
                 Key: OAK-4473
                 URL: https://issues.apache.org/jira/browse/OAK-4473
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: core
    Affects Versions: 1.2.16, 1.5.3, 1.4.3, 1.0.31
            Reporter: Julian Sedding
            Assignee: Julian Sedding


Datastore garbage collection (DS GC) can fail if it encounters IDs containing 
backslashes. This can happen, for example, when a file is uploaded and its 
absolute (Windows) path is mistakenly stored as the file name.

The failure occurs because IDs are written unescaped to temporary files, which 
are then sorted. The sorting code unescapes each line as it reads it, so it 
throws an exception when a backslash is followed by any character other than 
'\', 'r' or 'n'.

{noformat}
org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector Blob garbage collection error
java.lang.IllegalArgumentException: Unexpected char [J] found at 78 of [92c3bcd2270655a9c911bec9f7a4851860f05c79#553941,/content/dam/\\MAPPED_DRIVE\JOHN$\ABC.pdf]. Expected '\' or 'r' or 'n
        at org.apache.jackrabbit.oak.commons.sort.EscapeUtils.unescape(EscapeUtils.java:126)
        at org.apache.jackrabbit.oak.commons.sort.EscapeUtils.unescapeLineBreaks(EscapeUtils.java:51)
        at org.apache.jackrabbit.oak.commons.sort.ExternalSort.readLine(ExternalSort.java:633)
        at org.apache.jackrabbit.oak.commons.sort.ExternalSort.sortInBatch(ExternalSort.java:204)
        at org.apache.jackrabbit.oak.commons.sort.ExternalSort.sortInBatch(ExternalSort.java:257)
        at org.apache.jackrabbit.oak.commons.sort.ExternalSort.sortInBatch(ExternalSort.java:159)
        at org.apache.jackrabbit.oak.plugins.blob.GarbageCollectorFileState.sort(GarbageCollectorFileState.java:147)
        at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.iterateNodeTree(MarkSweepGarbageCollector.java:538)
        at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.mark(MarkSweepGarbageCollector.java:278)
        at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.markAndSweep(MarkSweepGarbageCollector.java:248)
        at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.collectGarbage(MarkSweepGarbageCollector.java:163)
        at org.apache.jackrabbit.oak.plugins.blob.BlobGC$1.call(BlobGC.java:87)
        at org.apache.jackrabbit.oak.plugins.blob.BlobGC$1.call(BlobGC.java:83)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{noformat}
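
For illustration, the mismatch can be reproduced with a minimal, self-contained sketch. The class LineEscapeSketch and its methods are hypothetical and are not Oak's EscapeUtils; they only assume the convention implied by the error message above, namely that a backslash in a stored line must be followed by '\', 'r' or 'n'. Under that assumption, escaping each ID before it is written to the temporary file lets the reader recover it unchanged.

```java
// Hypothetical helper for illustration only; this is NOT Oak's EscapeUtils.
// Assumed convention: '\' -> "\\", LF -> "\n", CR -> "\r" (two-character sequences).
final class LineEscapeSketch {

    private LineEscapeSketch() {
    }

    /** Escapes backslashes and line breaks so the value fits on one line. */
    static String escape(String line) {
        StringBuilder sb = new StringBuilder(line.length() + 8);
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Reverses escape(); rejects input that was never escaped. */
    static String unescape(String line) {
        StringBuilder sb = new StringBuilder(line.length());
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c != '\\') {
                sb.append(c);
                continue;
            }
            if (i + 1 >= line.length()) {
                throw new IllegalArgumentException("Dangling backslash at " + i);
            }
            char next = line.charAt(++i);
            switch (next) {
                case '\\': sb.append('\\'); break;
                case 'n': sb.append('\n'); break;
                case 'r': sb.append('\r'); break;
                default:
                    throw new IllegalArgumentException(
                            "Unexpected char [" + next + "] found at " + i + " of [" + line + "]");
            }
        }
        return sb.toString();
    }
}
```

With this sketch, calling unescape on a raw ID that contains a Windows path fails in the same way as the trace above, while unescape(escape(id)) returns the original ID. A fix in saveBatchToFile would presumably apply the equivalent escaping at the point where each ID is written.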



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
