[ 
https://issues.apache.org/jira/browse/BEAM-6923?focusedWorklogId=318593&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-318593
 ]

ASF GitHub Bot logged work on BEAM-6923:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Sep/19 20:47
            Start Date: 25/Sep/19 20:47
    Worklog Time Spent: 10m 
      Work Description: lukecwik commented on pull request #9647: [BEAM-6923] 
limit number of concurrent artifact write to 8
URL: https://github.com/apache/beam/pull/9647#discussion_r328332929
 
 

 ##########
 File path: 
runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java
 ##########
 @@ -227,9 +235,11 @@ public void onNext(PutArtifactRequest putArtifactRequest) {
                   encodedFileName(metadata.getMetadata()), StandardResolveOptions.RESOLVE_FILE);
           LOG.debug(
               "Going to stage artifact {} to {}.", metadata.getMetadata().getName(), artifactId);
-          artifactWritableByteChannel = FileSystems.create(artifactId, MimeTypes.BINARY);
           hasher = Hashing.sha256().newHasher();
+          permittedConcurrentWrite.acquire();
 
 Review comment:
   The FileSystem may push the write to another thread, unblocking the gRPC 
thread.
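
   A minimal sketch of the permit-based throttling under discussion, assuming a
plain java.util.concurrent.Semaphore and a simulated write. The class name
BoundedWriteSketch, the method peakConcurrentWrites, and the sleep standing in
for the write are hypothetical and are not taken from the Beam code; only the
limit of 8 comes from the pull request title. In this sketch the permit is
released in a finally block on the thread that actually performs the write, so
the number of in-flight writes stays bounded no matter which thread does the
work. This is one possible reading of the concern above, not necessarily what
the pull request does.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedWriteSketch {
  // Hypothetical constant mirroring the "8" in the pull request title.
  static final int MAX_CONCURRENT_WRITES = 8;

  /** Runs simulated writes on a thread pool and returns the peak number in flight. */
  static int peakConcurrentWrites(int tasks, int threads) throws InterruptedException {
    Semaphore permits = new Semaphore(MAX_CONCURRENT_WRITES);
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(threads);

    for (int i = 0; i < tasks; i++) {
      pool.submit(() -> {
        try {
          // Block until one of the permits is free.
          permits.acquire();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        try {
          int now = inFlight.incrementAndGet();
          peak.accumulateAndGet(now, Math::max);
          Thread.sleep(5); // stand-in for the actual artifact write
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          // Release on the writing thread, only after the write has finished.
          inFlight.decrementAndGet();
          permits.release();
        }
      });
    }

    pool.shutdown();
    if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
      throw new IllegalStateException("simulated writes did not finish in time");
    }
    return peak.get();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("peak concurrent writes: " + peakConcurrentWrites(64, 32));
  }
}
```

   With 32 pool threads competing for 8 permits, the reported peak should never
exceed MAX_CONCURRENT_WRITES. If the permit were instead released by the
calling thread as soon as the write was handed off, the bound would no longer
cover the buffered data still being uploaded.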
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 318593)
    Time Spent: 2h 20m  (was: 2h 10m)

> OOM errors in jobServer when using GCS artifactDir
> --------------------------------------------------
>
>                 Key: BEAM-6923
>                 URL: https://issues.apache.org/jira/browse/BEAM-6923
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-harness
>            Reporter: Lukasz Gajowy
>            Assignee: Ankur Goenka
>            Priority: Major
>         Attachments: Instance counts.png, Paths to GC root.png, 
> Telemetries.png, beam6923-flink156.m4v, beam6923flink182.m4v, heapdump 
> size-sorted.png
>
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> When starting jobServer with artifactDir pointing to a GCS bucket: 
> {code:java}
> ./gradlew :beam-runners-flink_2.11-job-server:runShadow 
> -PflinkMasterUrl=localhost:8081 -PartifactsDir=gs://the-bucket{code}
> and running a Java portable pipeline with the following portability-related 
> pipeline options: 
> {code:java}
> --runner=PortableRunner --jobEndpoint=localhost:8099 
> --defaultEnvironmentType=DOCKER 
> --defaultEnvironmentConfig=gcr.io/<my-freshly-built-sdk-harness-image>/java:latest{code}
>  
> I'm facing a series of OOM errors, like this: 
> {code:java}
> Exception in thread "grpc-default-executor-3" java.lang.OutOfMemoryError: Java heap space
> at com.google.api.client.googleapis.media.MediaHttpUploader.buildContentChunk(MediaHttpUploader.java:606)
> at com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:408)
> at com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
> at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:508)
> at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
> at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:549)
> at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:301)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745){code}
>  
> This does not happen when I'm using a local filesystem for the artifact 
> staging location. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
