[ https://issues.apache.org/jira/browse/OAK-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Wehner updated OAK-4565:
-------------------------------
    Attachment: OAK-4565-TRUNK.patch
                OAK-4565-1.4.patch

Attached are proposed patches (for trunk and 1.4) which spool the metadata 
record to a temporary file and hand that file over to the TransferManager. 
With this patch we were able to successfully run a Blob GC on our 47TB 
repository for the first time.
It's a bit unfortunate that in the GC case the file already exists on disk 
and has to be recreated, but since the signature of {{SharedDataStore}} 
requires an InputStream I see no other way to solve this. It also slows down 
the common case of creating 0-byte marker records, but those are written so 
infrequently that it shouldn't be a problem.
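For illustration, a minimal sketch of the spooling approach (class, method 
and variable names here are placeholders, not the actual patch code):
{noformat}
import java.io.File;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.Upload;

public class SpoolingSketch {

    // Spool the stream to a temp file so the TransferManager can take
    // the content length from the file and stream it from disk instead
    // of buffering the whole record in memory.
    static void addMetadataRecord(TransferManager tm, String bucket,
            String key, InputStream input) throws Exception {
        File tmp = File.createTempFile("s3meta", null);
        try {
            Files.copy(input, tmp.toPath(),
                    StandardCopyOption.REPLACE_EXISTING);
            Upload upload = tm.upload(bucket, key, tmp);
            upload.waitForCompletion(); // blocks until the upload finishes
        } finally {
            tmp.delete();
        }
    }
}
{noformat}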

> S3Backend fails to upload large metadata records
> ------------------------------------------------
>
>                 Key: OAK-4565
>                 URL: https://issues.apache.org/jira/browse/OAK-4565
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: blob
>    Affects Versions: 1.4.5
>            Reporter: Martin Wehner
>              Labels: gc, s3
>         Attachments: OAK-4565-1.4.patch, OAK-4565-TRUNK.patch
>
>
> If a large enough metadata record is added to an S3 DataStore (such as the 
> list of blob references collected during the mark phase of the 
> MarkSweepGC) the upload will fail (i.e. never start). This is caused by 
> {{S3Backend.addMetadataRecord()}} handing an InputStream to the S3 
> TransferManager without specifying the content length in the 
> ObjectMetadata. 
> A warning to this effect is logged by the AWS SDK each time a metadata 
> record is added: 
> {noformat}
> [s3-transfer-manager-worker-1] AmazonS3Client.java:1364 No content length specified for stream data.  Stream contents will be buffered in memory and could result in out of memory errors.
> {noformat}
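> For illustration, a minimal sketch (names are placeholders, not Oak code) 
> of how the SDK avoids buffering when the length is supplied up front: 
> {noformat}
> import java.io.InputStream;
>
> import com.amazonaws.services.s3.model.ObjectMetadata;
> import com.amazonaws.services.s3.transfer.TransferManager;
> import com.amazonaws.services.s3.transfer.Upload;
>
> public class ContentLengthSketch {
>
>     static Upload upload(TransferManager tm, String bucket, String key,
>             InputStream in, long length) {
>         ObjectMetadata metadata = new ObjectMetadata();
>         // Without this the SDK cannot size the request and falls back
>         // to buffering the whole stream in memory (the warning above).
>         metadata.setContentLength(length);
>         return tm.upload(bucket, key, in, metadata);
>     }
> }
> {noformat}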
> Normally this shouldn't be too big a problem, but in a repository with over 
> 36 million blob references the list of marked references produced by the 
> GC is over 5GB. In this case the S3 transfer worker thread gets stuck 
> trying to buffer the entire stream in memory and never finishes (even 
> though the JVM has 80GB of heap), eating away resources in the process:
> {noformat}
>    java.lang.Thread.State: RUNNABLE
>       at org.apache.http.util.ByteArrayBuffer.append(ByteArrayBuffer.java:90)
>       at org.apache.http.util.EntityUtils.toByteArray(EntityUtils.java:137)
>       at org.apache.http.entity.BufferedHttpEntity.<init>(BufferedHttpEntity.java:63)
>       at com.amazonaws.http.HttpRequestFactory.newBufferedHttpEntity(HttpRequestFactory.java:247)
>       at com.amazonaws.http.HttpRequestFactory.createHttpRequest(HttpRequestFactory.java:126)
>       at com.amazonaws.http.AmazonHttpClient$ExecOneRequestParams.newApacheRequest(AmazonHttpClient.java:650)
>       at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:730)
>       at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:505)
>       at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:317)
>       at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3595)
>       at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1382)
>       at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:131)
>       at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:123)
>       at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
>       at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> {noformat}
> The last log message by the GC thread will be like this:
> {noformat}
> *INFO* [sling-oak-observation-1273] org.apache.jackrabbit.oak.plugins.blob.MarkSweepGarbageCollector Number of valid blob references marked under mark phase of Blob garbage collection [36147734]
> {noformat}
> followed by the above AWS warning, after which it stalls waiting for the 
> transfer to finish.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
