[ https://issues.apache.org/jira/browse/BEAM-6923?focusedWorklogId=318595&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-318595 ]
ASF GitHub Bot logged work on BEAM-6923:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/Sep/19 20:51
            Start Date: 25/Sep/19 20:51
    Worklog Time Spent: 10m
      Work Description: lukecwik commented on issue #9647: [BEAM-6923] limit number of concurrent artifact write to 8
URL: https://github.com/apache/beam/pull/9647#issuecomment-535215624

   > > It may be much simpler to set the GCS upload buffer size to 1MiB to solve BEAM-6923 via setting [GcsUploadBufferSize](https://github.com/apache/beam/blob/c2f0d282337f3ae0196a7717712396a5a41fdde1/sdks/java/extensions/google-cloud-platform-core/src/main/java/org/apache/beam/sdk/extensions/gcp/options/GcsOptions.java#L91). I think we have historically used 1MiB, while the default is 64MiB.
   >
   > This makes sense. Do you think we should apply this configuration to all GCS writes?
   > I am a bit concerned, as it could impact the performance of the GCS sink's data path.

   We can enhance the filesystems configuration to allow passing filesystem-specific options. In portable pipeline execution we don't expect to run user transforms for reading/writing from GCS, and this would only be for Runner interactions with GCS, so I believe we could make it 1MiB for the GcsFileSystem by default.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 318595)
    Time Spent: 2.5h  (was: 2h 20m)

> OOM errors in jobServer when using GCS artifactDir
> --------------------------------------------------
>
>                 Key: BEAM-6923
>                 URL: https://issues.apache.org/jira/browse/BEAM-6923
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-harness
>            Reporter: Lukasz Gajowy
>            Assignee: Ankur Goenka
>            Priority: Major
>         Attachments: Instance counts.png, Paths to GC root.png, Telemetries.png, beam6923-flink156.m4v, beam6923flink182.m4v, heapdump size-sorted.png
>
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When starting jobServer with artifactDir pointing to a GCS bucket:
> {code:java}
> ./gradlew :beam-runners-flink_2.11-job-server:runShadow -PflinkMasterUrl=localhost:8081 -PartifactsDir=gs://the-bucket{code}
> and running a Java portable pipeline with the following portability-related pipeline options:
> {code:java}
> --runner=PortableRunner --jobEndpoint=localhost:8099 --defaultEnvironmentType=DOCKER --defaultEnvironmentConfig=gcr.io/<my-freshly-built-sdk-harness-image>/java:latest{code}
>
> I'm facing a series of OOM errors, like this:
> {code:java}
> Exception in thread "grpc-default-executor-3" java.lang.OutOfMemoryError: Java heap space
>     at com.google.api.client.googleapis.media.MediaHttpUploader.buildContentChunk(MediaHttpUploader.java:606)
>     at com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:408)
>     at com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336)
>     at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:508)
>     at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:432)
>     at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:549)
>     at com.google.cloud.hadoop.util.AbstractGoogleAsyncWriteChannel$UploadOperation.call(AbstractGoogleAsyncWriteChannel.java:301)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745){code}
>
> This does not happen when I'm using a local filesystem for the artifact staging location.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
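
Editorial sketch, not part of the original message: the worklog comment above weighs two mitigations, capping concurrent artifact writes at 8 (the PR title) and shrinking the GCS upload buffer from 64MiB to 1MiB. The Java below illustrates both using only the standard library. It assumes each in-flight upload holds one full buffer on the heap, which the `buildContentChunk` frame in the stack trace suggests but the thread does not confirm. `ArtifactUploadSketch`, `worstCaseBufferBytes`, and `runBounded` are hypothetical names; nothing here calls Beam or GCS, and this is not the code from PR #9647.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; not part of Beam or the GCS connector. */
final class ArtifactUploadSketch {
  static final long MIB = 1024L * 1024L;

  /** Worst-case heap held by upload buffers, ASSUMING one full buffer per in-flight upload. */
  static long worstCaseBufferBytes(int concurrentUploads, long bufferBytes) {
    return concurrentUploads * bufferBytes;
  }

  /**
   * Runs `tasks` simulated uploads on `threads` worker threads while a semaphore
   * lets at most `maxConcurrent` of them proceed at once.
   * Returns the highest number of uploads observed in flight.
   */
  static int runBounded(int tasks, int threads, int maxConcurrent) {
    Semaphore permits = new Semaphore(maxConcurrent);
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (int i = 0; i < tasks; i++) {
        futures.add(pool.submit(() -> {
          permits.acquire(); // block until one of the permits is free
          try {
            int now = inFlight.incrementAndGet();
            peak.accumulateAndGet(now, Math::max);
            Thread.sleep(5); // stand-in for the real upload
          } finally {
            inFlight.decrementAndGet();
            permits.release();
          }
          return null;
        }));
      }
      for (Future<?> f : futures) {
        f.get(); // wait for every simulated upload to finish
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }

  public static void main(String[] args) {
    System.out.println("8 uploads x 64MiB buffers, in MiB: " + worstCaseBufferBytes(8, 64 * MIB) / MIB);
    System.out.println("8 uploads x 1MiB buffers, in MiB: " + worstCaseBufferBytes(8, MIB) / MIB);
    System.out.println("peak in-flight uploads: " + runBounded(32, 16, 8));
  }
}
```

Under that one-buffer-per-upload assumption, capping concurrency bounds the total at 8 buffers regardless of how many artifacts are staged, while shrinking the buffer reduces what each of those uploads holds; the two changes are independent and can be combined.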