[ https://issues.apache.org/jira/browse/BEAM-9492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe Cullen updated BEAM-9492:
-----------------------------
    Description: 
When streaming inserts into BigQuery using BigQueryIO.Write, if an error other 
than a row insertion error occurs (e.g. an IOException), BigQueryIO assumes it 
must be a rate-limit error and enters an [infinite loop of 
retries|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L785].
 This is reasonable for rate-limit errors, but other types of error can also 
appear.
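A minimal, hypothetical sketch of the shape of that retry loop (simplified names, not Beam's actual code) shows the problem: the error type is never inspected, so a permanent error such as a missing table is retried exactly like a transient rate-limit error. The demo caps attempts for illustration; the real path has no such cap:

```java
import java.io.IOException;

// Hypothetical, simplified sketch of the retry loop's shape (not Beam's
// actual implementation): any IOException from the insertAll call is
// treated as transient and retried, whatever its cause.
public class RetryLoopSketch {
    interface InsertCall { void run() throws IOException; }

    // Retries the call on IOException up to maxAttempts and returns the
    // number of attempts made. An unbounded version of this loop never
    // terminates when the error is permanent (e.g. "Not found").
    static int insertWithRetries(InsertCall call, int maxAttempts) {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                call.run();
                return attempts;     // success
            } catch (IOException e) {
                // The error is not inspected: a permanent "Not found" is
                // retried exactly like a rate-limit error.
                if (attempts >= maxAttempts) {
                    return attempts; // demo-only cap
                }
            }
        }
    }

    public static void main(String[] args) {
        // Simulate a permanently missing table: every call fails.
        int attempts = insertWithRetries(() -> {
            throw new IOException("Not found: Table project:dataset.table");
        }, 5);
        System.out.println("attempts=" + attempts); // prints attempts=5
    }
}
```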

 

One example is a "Not found" error for the BigQuery table. This can happen if 
the table was originally created by BigQueryIO.Write (with CreateDisposition 
set to CREATE_IF_NEEDED) but has since been deleted, because created tables are 
[cached|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/CreateTables.java#L61].
 This is more likely to happen in a long-running streaming job. The infinite 
loop of retries does not help, as only row insertion is retried, not table 
creation. The table either needs to be created by an external process, or the 
pipeline needs to be restarted (thereby clearing the cache).
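The cache behaviour described above can be sketched with a hypothetical, self-contained example (illustrative names only, not Beam's code): once a table spec enters the cache, creation is never attempted again, even if the table has since been deleted out-of-band.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a created-tables cache: after the first call for a
// given table spec, creation is skipped forever, so an externally deleted
// table is never recreated.
public class TableCacheSketch {
    private final Set<String> createdTables = new HashSet<>();

    // Returns true if a create call was issued on this invocation.
    boolean ensureTableExists(String tableSpec) {
        if (createdTables.contains(tableSpec)) {
            return false;  // cache hit: creation is skipped
        }
        // ... the real code would issue a tables().insert() call here ...
        createdTables.add(tableSpec);
        return true;
    }

    public static void main(String[] args) {
        TableCacheSketch cache = new TableCacheSketch();
        System.out.println(cache.ensureTableExists("p:d.t")); // true  (created)
        // The table is now deleted externally; the cache does not notice:
        System.out.println(cache.ensureTableExists("p:d.t")); // false (skipped)
    }
}
```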

 

This happens regardless of the InsertRetryPolicy setting, as these errors are 
not insertion errors. As a result, we see logs such as "INFO: BigQuery 
insertAll error, retrying: Not found: Table <project>:<dataset>.<table>" even 
when InsertRetryPolicy is set to "neverRetry()", which is confusing behaviour. 
I expect similar issues for other types of error (e.g. no response from the 
BigQuery API).
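Why neverRetry() does not stop the loop can be sketched as follows (hypothetical, simplified names, not Beam's API): the per-row retry policy is only consulted for rows that BigQuery rejected in an otherwise successful response; an exception thrown by the insertAll call itself never reaches it.

```java
import java.util.List;

// Hypothetical sketch: the row-level retry policy only sees per-row errors
// from a successful insertAll response. A call-level failure (e.g. a
// "Not found" IOException) is thrown before any row errors exist, so it
// bypasses the policy and is handled by the outer retry loop instead.
public class PolicySketch {
    interface RowRetryPolicy { boolean shouldRetry(String rowError); }

    // Analogous in spirit to InsertRetryPolicy.neverRetry().
    static final RowRetryPolicy NEVER = rowError -> false;

    // Returns the number of rejected rows queued for retry.
    static int handleResponse(List<String> rowErrors, RowRetryPolicy policy) {
        int retried = 0;
        for (String err : rowErrors) {
            if (policy.shouldRetry(err)) {
                retried++;
            }
        }
        return retried;
    }

    public static void main(String[] args) {
        // Row-level errors: the policy applies, so nothing is retried.
        System.out.println(handleResponse(List.of("invalid row"), NEVER)); // 0
    }
}
```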

 

To recreate this issue, create a pipeline which inserts into BigQuery using 
BigQueryIO, where the table does not exist beforehand but should be created by 
BigQueryIO (i.e. CreateDisposition = CREATE_IF_NEEDED). Then mock 
BigQueryServicesImpl's call to create the BigQuery table so that no table is 
actually created (I did this in a brute-force way by creating my own 
BigQueryServicesImpl and feeding it in via ".withTestServices()"). The pipeline 
will enter an infinite loop, logging "INFO: BigQuery insertAll error, retrying: 
Not found: Table <project>:<dataset>.<table>".

 

One suggestion is to add another retry policy, controlling retries for 
non-insertion errors. This could be optional for users and take effect 
[here|https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServicesImpl.java#L797].
 An alternative (or additional) option would be to check for "table not found" 
errors in this clause and, if encountered, retry table creation before next 
retrying insertion.
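The suggested policy could be sketched as below. Everything here is hypothetical and illustrative; no such interface exists in Beam today, and the message-matching is deliberately naive:

```java
import java.io.IOException;

// Hypothetical sketch of a user-supplied policy for call-level
// (non-insertion) errors, as suggested above. All names are illustrative.
public class NonInsertionPolicySketch {
    interface NonInsertionRetryPolicy { boolean shouldRetry(IOException e); }

    // Example policy: retry call-level errors unless the table is missing.
    // (Naive string match for illustration only.)
    static final NonInsertionRetryPolicy SKIP_NOT_FOUND =
        e -> !e.getMessage().contains("Not found");

    static boolean shouldRetryCall(IOException e,
                                   NonInsertionRetryPolicy policy) {
        return policy.shouldRetry(e);
    }

    public static void main(String[] args) {
        IOException notFound = new IOException("Not found: Table p:d.t");
        IOException rateLimit = new IOException("rateLimitExceeded");
        System.out.println(shouldRetryCall(notFound, SKIP_NOT_FOUND));  // false
        System.out.println(shouldRetryCall(rateLimit, SKIP_NOT_FOUND)); // true
    }
}
```

With something like this in place, a "Not found" error would surface promptly instead of looping, while genuine rate-limit errors would still be retried.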


> Allow retry policy for non-insertion errors in BigQueryIO
> ---------------------------------------------------------
>
>                 Key: BEAM-9492
>                 URL: https://issues.apache.org/jira/browse/BEAM-9492
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-gcp
>            Reporter: Joe Cullen
>            Priority: Minor
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
