[ 
https://issues.apache.org/jira/browse/BEAM-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16665747#comment-16665747
 ] 

Raghu Angadi commented on BEAM-5514:
------------------------------------

> 1. Handle quotaExceeded within the client with a backoff retry.
Agreed. 'Quota Exceeded' should be treated the same as 'Rate Limited'. I think they 
are logically the same thing w.r.t. BigQueryIO.
A note about worker retries: the worker does not do exponential backoff, but it does 
wait 10 seconds before re-running a failed bundle ('work item' in Dataflow 
terminology), which is actually quite a long wait.
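
A minimal sketch of what "treat quotaExceeded the same as rate limited" could 
look like. The class and method names are hypothetical and this is not the 
actual ApiErrorExtractor code; it only assumes the error arrives as an HTTP 403 
with a reason string, as the issue description says, and that the existing 
rate-limit reason is "rateLimitExceeded".

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Beam or bigdata-interop.
class RetryableReasons {
    // Reasons that should trigger a backoff retry. "quotaExceeded" is the
    // reason quoted in the issue; "rateLimitExceeded" is assumed to be the
    // reason already treated as rate limited.
    private static final Set<String> BACKOFF_REASONS =
        new HashSet<>(Arrays.asList("rateLimitExceeded", "quotaExceeded"));

    // Returns true when a failed insert should be retried with backoff.
    static boolean shouldBackoff(int httpStatus, String reason) {
        return httpStatus == 403
            && reason != null
            && BACKOFF_REASONS.contains(reason);
    }
}
```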

The main issue is the backoff mechanism itself. 'insertAll' in BigQueryIO uses 
an unlimited thread pool to execute each insert from a separate thread. There 
could be thousands of inserts in a bundle. The backoff is calculated for each 
insert independently, so we could have 1000 threads each backing off a bit, which 
does not really help cut down the load.

Overall, we should control the aggregate rate (by reducing both the parallelism 
and the frequency of retries within each thread). As such, I think we could use 
a smaller pool to insert, but I am not sure what the right size is. A simple 
policy could be to multiply the retry time by the number of active inserts: 
next_retry = backoff(num_retries) * num_active_inserts.
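
To make the policy concrete, here is a rough sketch of that formula. All names 
and constants (ScaledBackoff, the 200 ms initial delay, the 60 s cap on the base 
backoff) are assumptions for illustration, not values taken from BigQueryIO.

```java
// Hypothetical illustration of: next_retry = backoff(num_retries) * num_active_inserts
class ScaledBackoff {
    // Assumed constants; the real values would need tuning.
    static final long INITIAL_BACKOFF_MS = 200;
    static final long MAX_BACKOFF_MS = 60_000;

    // Plain exponential backoff for a single insert, capped at MAX_BACKOFF_MS.
    static long backoff(int numRetries) {
        // Clamp the shift so the left shift cannot overflow a long.
        int shift = Math.max(0, Math.min(numRetries, 20));
        return Math.min(INITIAL_BACKOFF_MS << shift, MAX_BACKOFF_MS);
    }

    // Scale the per-insert backoff by the number of inserts currently in
    // flight, so the combined retry rate drops as concurrency grows.
    static long nextRetryMillis(int numRetries, int numActiveInserts) {
        return backoff(numRetries) * Math.max(1, numActiveInserts);
    }
}
```

With these assumed constants, 1000 active inserts on their third retry would 
each wait 1600 ms * 1000, about 27 minutes, so in practice the product would 
probably need its own upper bound, or the pool size would need to be limited.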






> BigQueryIO doesn't handle quotaExceeded errors properly
> -------------------------------------------------------
>
>                 Key: BEAM-5514
>                 URL: https://issues.apache.org/jira/browse/BEAM-5514
>             Project: Beam
>          Issue Type: Bug
>          Components: io-java-gcp
>            Reporter: Kevin Peterson
>            Assignee: Chamikara Jayalath
>            Priority: Major
>
> When exceeding a streaming quota for BigQuery insertAll requests, BigQuery 
> returns a 403 with reason "quotaExceeded".
> The current implementation of BigQueryIO does not consider this to be a rate 
> limited exception, and therefore does not perform exponential backoff 
> properly, leading to repeated calls to BQ.
> The actual error is in the 
> [ApiErrorExtractor|https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/util/src/main/java/com/google/cloud/hadoop/util/ApiErrorExtractor.java#L263]
>  class, which is called from 
> [BigQueryServicesImpl|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L739]
>  to determine how to retry the failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
