Julian Sedding created OAK-4473:
-----------------------------------

             Summary: MarkSweepGarbageCollector#saveBatchToFile should escape IDs
                 Key: OAK-4473
                 URL: https://issues.apache.org/jira/browse/OAK-4473
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: core
    Affects Versions: 1.2.16, 1.5.3, 1.4.3, 1.0.31
            Reporter: Julian Sedding
            Assignee: Julian Sedding

Datastore garbage collection (DS GC) can fail if it encounters IDs containing backslashes. This is because IDs are written to temporary files and then sorted. The sorting algorithm assumes that the lines are escaped and throws an exception otherwise.

Such IDs can occur, e.g., when a file gets uploaded and its absolute (Windows) path is mistakenly stored as the file name.

{noformat}
org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector Blob garbage collection error
java.lang.IllegalArgumentException: Unexpected char [J] found at 78 of [92c3bcd2270655a9c911bec9f7a4851860f05c79#553941,/content/dam/\\MAPPED_DRIVE\JOHN$\ABC.pdf]. Expected '\' or 'r' or 'n
    at org.apache.jackrabbit.oak.commons.sort.EscapeUtils.unescape(EscapeUtils.java:126)
    at org.apache.jackrabbit.oak.commons.sort.EscapeUtils.unescapeLineBreaks(EscapeUtils.java:51)
    at org.apache.jackrabbit.oak.commons.sort.ExternalSort.readLine(ExternalSort.java:633)
    at org.apache.jackrabbit.oak.commons.sort.ExternalSort.sortInBatch(ExternalSort.java:204)
    at org.apache.jackrabbit.oak.commons.sort.ExternalSort.sortInBatch(ExternalSort.java:257)
    at org.apache.jackrabbit.oak.commons.sort.ExternalSort.sortInBatch(ExternalSort.java:159)
    at org.apache.jackrabbit.oak.plugins.blob.GarbageCollectorFileState.sort(GarbageCollectorFileState.java:147)
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.iterateNodeTree(MarkSweepGarbageCollector.java:538)
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.mark(MarkSweepGarbageCollector.java:278)
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.markAndSweep(MarkSweepGarbageCollector.java:248)
    at org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector.collectGarbage(MarkSweepGarbageCollector.java:163)
    at org.apache.jackrabbit.oak.plugins.blob.BlobGC$1.call(BlobGC.java:87)
    at org.apache.jackrabbit.oak.plugins.blob.BlobGC$1.call(BlobGC.java:83)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{noformat}
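
To illustrate the mismatch described above, here is a minimal, self-contained sketch of a writer and reader that agree on an escaping scheme. It is not Oak's EscapeUtils code: the class IdEscapeSketch and its escape/unescape methods are hypothetical. The scheme (a backslash followed by '\', 'r' or 'n') is only inferred from the exception message in the trace. The sketch shows that an ID round-trips when it is escaped before being written, and that the reader rejects the same ID when it is written raw.

```java
// Hypothetical sketch, not Oak's implementation. The escape scheme is
// inferred from the exception message: after a backslash the reader
// accepts only '\', 'r' or 'n'.
final class IdEscapeSketch {

    // Escapes backslashes and line breaks so that one ID occupies one line.
    public static String escape(String line) {
        StringBuilder sb = new StringBuilder(line.length() + 8);
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Reverses escape(); throws if a backslash is followed by anything
    // other than '\', 'r' or 'n', i.e. if the line was never escaped.
    public static String unescape(String line) {
        StringBuilder sb = new StringBuilder(line.length());
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c != '\\') {
                sb.append(c);
                continue;
            }
            if (i + 1 >= line.length()) {
                throw new IllegalArgumentException(
                        "Dangling escape char at " + i + " of [" + line + "]");
            }
            char next = line.charAt(++i);
            switch (next) {
                case '\\': sb.append('\\'); break;
                case 'n':  sb.append('\n'); break;
                case 'r':  sb.append('\r'); break;
                default:
                    throw new IllegalArgumentException(
                            "Unexpected char [" + next + "] found at " + i
                            + " of [" + line + "]");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The ID from the report, written as a Java string literal.
        String id = "92c3bcd2270655a9c911bec9f7a4851860f05c79#553941,"
                + "/content/dam/\\\\MAPPED_DRIVE\\JOHN$\\ABC.pdf";

        // Escaped on write, unescaped on read: the ID survives unchanged.
        System.out.println(unescape(escape(id)).equals(id));

        // Written raw, then unescaped on read: the reader rejects the line.
        try {
            unescape(id);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under these assumptions, the fix suggested by the summary amounts to applying the escaping step to each ID before it is written to the temporary file, so that the sorter's unescaping step sees well-formed input.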