[ https://issues.apache.org/jira/browse/BEAM-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207613#comment-16207613 ]
Vincent Spiewak edited comment on BEAM-2840 at 10/17/17 1:06 PM:
-----------------------------------------------------------------

[~reuvenlax] You can see the error at
{code:java}
2017-10-16_08_33_38-17765248162467300750
{code}

was (Author: vspiewak):
    [~reuvenlax] You can see the error at `2017-10-16_08_33_38-17765248162467300750`

> BigQueryIO write is slow/fail with a bounded source
> ---------------------------------------------------
>
>                 Key: BEAM-2840
>                 URL: https://issues.apache.org/jira/browse/BEAM-2840
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-gcp
>    Affects Versions: 2.0.0
>         Environment: Google Cloud Platform
>            Reporter: Vincent Spiewak
>            Assignee: Reuven Lax
>         Attachments: PrepareWrite.BatchLoads.png
>
>
> The BigQueryIO writer is slow or fails if the input source is bounded.
> EDIT: Input BQ: 294 GB, 741,896,827 events
> If the input source is bounded (GCS / BQ select / ...), the BigQueryIO writer uses
> "[Method.FILE_LOADS|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1168]"
> instead of streaming inserts.
> Large amounts of input data result in a java.lang.OutOfMemoryError / Java
> heap space (500 million rows).
> !PrepareWrite.BatchLoads.png|thumbnail!
> We cannot use "Method.STREAMING_INSERTS" or control the batch sizes, since
> [withMaxFilesPerBundle|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1131]
> is private :(
> Someone reported a similar problem with GCS -> BQ on Stack Overflow:
> [Why is writing to BigQuery from a Dataflow/Beam pipeline slow?|https://stackoverflow.com/questions/45889992/why-is-writing-to-bigquery-from-a-dataflow-beam-pipeline-slow#comment78954153_45889992]

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)