[ https://issues.apache.org/jira/browse/BEAM-6443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Heejong Lee updated BEAM-6443:
------------------------------
    Summary: decrease the number of threads for BigQuery streaming insertAll  (was: decrease the number of thread for BigQuery streaming insertAll)

> decrease the number of threads for BigQuery streaming insertAll
> ---------------------------------------------------------------
>
>                 Key: BEAM-6443
>                 URL: https://issues.apache.org/jira/browse/BEAM-6443
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>            Reporter: Heejong Lee
>            Assignee: Heejong Lee
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When inserting a large number of very small elements into BigQuery via
> streaming insertAll, BigQueryIO causes many quota exceeded errors. This
> implies that 1) BigQueryIO puts unnecessary overhead on the BigQuery API
> layer by sending requests too fast, and 2) the log file becomes very large
> because the same error messages are repeated. Currently we use 50 shards for
> writing data into BigQuery, and in each bundle 20-30 futures are executed
> simultaneously with an unlimited thread pool. It would be worth investigating
> whether just a single thread pool is sufficient for running concurrent
> insertAll.
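
To illustrate the idea in the description, here is a minimal Java sketch of how a bounded thread pool caps the number of requests in flight at once. It is not BigQueryIO code: the class name `BoundedInsertSketch`, the helper `fakeInsertAll` (which only sleeps in place of a real insertAll call), and the pool sizes are all hypothetical. Reading "single thread pool" as a small fixed-size pool is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedInsertSketch {

    /** Hypothetical stand-in for one streaming insertAll request; it only sleeps. */
    static void fakeInsertAll() throws InterruptedException {
        Thread.sleep(5);
    }

    /**
     * Runs `requests` simulated inserts on a fixed pool of `poolSize` threads
     * and returns the highest number of inserts observed in flight at once.
     */
    static int maxConcurrentInserts(int requests, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < requests; i++) {
                futures.add(pool.submit(() -> {
                    // Record how many simulated inserts are running right now.
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        fakeInsertAll();
                    } finally {
                        inFlight.decrementAndGet();
                    }
                    return null;
                }));
            }
            // Wait for every simulated insert to finish.
            for (Future<?> f : futures) {
                f.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 30 requests mirrors the "20-30 futures per bundle" figure above.
        System.out.println("peak with 1 thread:  " + maxConcurrentInserts(30, 1));
        System.out.println("peak with 4 threads: " + maxConcurrentInserts(30, 4));
    }
}
```

With a pool of one thread the observed peak is 1, and with a pool of four it can never exceed 4. An unbounded pool would let all 20-30 futures of a bundle run at the same time.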