Heejong Lee created BEAM-6443:
---------------------------------

             Summary: Decrease the number of threads for BigQuery streaming 
insertAll
                 Key: BEAM-6443
                 URL: https://issues.apache.org/jira/browse/BEAM-6443
             Project: Beam
          Issue Type: Improvement
          Components: io-java-gcp
            Reporter: Heejong Lee
            Assignee: Heejong Lee


When inserting a large number of very small elements into BigQuery via 
streaming insertAll, BigQueryIO triggers many quota-exceeded errors. This 
means that 1) BigQueryIO places unnecessary load on the BigQuery API layer by 
sending requests too quickly, and 2) the log file grows very large because the 
same error message is repeated. Currently we use 50 shards for writing data 
into BigQuery, and in each bundle 20-30 futures are executed simultaneously on 
an unbounded thread pool. It would be worth investigating whether a 
single-threaded pool is sufficient for running concurrent insertAll calls.
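
To make the proposal concrete, here is a minimal, self-contained sketch of how 
a bounded pool caps the number of requests in flight. It is not BigQueryIO 
code: the class BoundedInsertSketch, the method peakConcurrency, and the 5 ms 
sleep standing in for one insertAll round trip are all hypothetical. It only 
illustrates that 30 submitted futures never run more than poolSize at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedInsertSketch {

  /**
   * Runs `tasks` simulated inserts on a fixed pool of `poolSize` threads and
   * returns the highest number of tasks observed running at the same time.
   */
  static int peakConcurrency(int tasks, int poolSize) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(poolSize);
    AtomicInteger inFlight = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    List<Future<?>> futures = new ArrayList<>();
    try {
      for (int i = 0; i < tasks; i++) {
        futures.add(pool.submit(() -> {
          int now = inFlight.incrementAndGet();
          peak.accumulateAndGet(now, Math::max);
          try {
            // Hypothetical stand-in for one insertAll round trip.
            Thread.sleep(5);
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          } finally {
            inFlight.decrementAndGet();
          }
        }));
      }
      // Wait for every simulated insert to finish.
      for (Future<?> f : futures) {
        f.get();
      }
    } finally {
      pool.shutdown();
    }
    return peak.get();
  }

  public static void main(String[] args) throws Exception {
    // With one worker thread, the 30 futures run strictly one at a time.
    System.out.println("single-thread peak: " + peakConcurrency(30, 1));
    // With four workers, at most four run concurrently.
    System.out.println("four-thread peak:   " + peakConcurrency(30, 4));
  }
}
```

With a pool of size 1 the peak is always 1, so the request rate per bundle is 
limited by round-trip latency rather than by the number of futures. Whether 
that throughput is sufficient in practice is the open question of this issue.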


