Pesach Weinstock created BEAM-6831:
--------------------------------------
Summary: python sdk WriteToBigQuery excessive usage of metered API
Key: BEAM-6831
URL: https://issues.apache.org/jira/browse/BEAM-6831
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Affects Versions: 2.10.0
Reporter: Pesach Weinstock
Attachments: apache-beam-py-sdk-gcp-bq-api-issue.png
Right now, there is a potential issue with the Python SDK where
{{beam.io.gcp.bigquery.WriteToBigQuery}} calls the following API more often
than needed:
https://www.googleapis.com/bigquery/v2/projects/<project-name>/datasets/<dataset-name>/tables/<table-name>?alt=json
The above request counts against BigQuery API quotas that are separate from
the quotas for BigQuery streaming inserts. When used in a streaming pipeline,
we hit this quota quickly, after which we cannot write any further data to
BigQuery.
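To illustrate why the call count matters: if table metadata is fetched once per write batch, the number of metered requests grows with throughput, whereas caching the result keeps it roughly constant. The following is only a minimal sketch of that idea. It is not Beam's implementation or a proposed patch; {{CachedTableLookup}}, {{fake_fetch}}, and the TTL value are hypothetical names and numbers chosen for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple

TableKey = Tuple[str, str, str]


class CachedTableLookup:
    """Hypothetical helper: memoize a table-metadata fetch per table.

    `fetch` stands in for whatever performs the metered table request;
    it is an assumption, not a real Beam or BigQuery client call.
    """

    def __init__(
        self,
        fetch: Callable[[str, str, str], Any],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[TableKey, Tuple[float, Any]] = {}

    def get(self, project: str, dataset: str, table: str) -> Any:
        key = (project, dataset, table)
        now = self._clock()
        hit = self._cache.get(key)
        # Reuse the cached value while it is younger than the TTL.
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        # Otherwise issue one (metered) fetch and remember the result.
        value = self._fetch(project, dataset, table)
        self._cache[key] = (now, value)
        return value


if __name__ == "__main__":
    fetch_count = 0

    def fake_fetch(project: str, dataset: str, table: str) -> dict:
        # Stand-in for the metered request; counts how often it is called.
        global fetch_count
        fetch_count += 1
        return {"id": f"{project}:{dataset}.{table}"}

    lookup = CachedTableLookup(fake_fetch)
    for _ in range(1000):  # e.g. 1000 write batches to the same table
        lookup.get("my-project", "my_dataset", "my_table")
    print("metered fetches issued:", fetch_count)
```

With {{CREATE_NEVER}} the table is expected to exist already, so a cached lookup like this would rarely go stale; whether that assumption holds for a given pipeline is a judgment call.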
Dispositions being used are:
* create_disposition: {{beam.io.BigQueryDisposition.CREATE_NEVER}}
* write_disposition: {{beam.io.BigQueryDisposition.WRITE_APPEND}}
This is currently blocking us from using BigQueryIO in a streaming pipeline to
write to BigQuery, and has required us to formally request an API quota
increase from Google as a temporary workaround.
Our pipeline uses DataflowRunner. I am unable to attach screenshots to this
JIRA issue, but the following message appears in the logs:
{code:java}
"error": {
  "code": 403,
  "message": "Exceeded rate limits: too many api requests per user per method for this user_method. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors",
  "errors": [
    {
      "message": "Exceeded rate limits: too many api requests per user per method for this user_method. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors",
      "domain": "usageLimits",
      "reason": "rateLimitExceeded"
    }
  ],
  "status": "PERMISSION_DENIED"
}
{code}
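For anyone trying to spot this failure in their own logs, here is a minimal sketch of how the payload above could be recognized programmatically. It assumes the full response body is a JSON object with a top-level {{error}} key (the log excerpt above omits the outer braces); {{is_rate_limit_error}} and {{SAMPLE_RESPONSE}} are hypothetical names, not part of any SDK.

```python
import json


def is_rate_limit_error(payload: str) -> bool:
    """Return True if a JSON error body is a 403 with reason rateLimitExceeded.

    Assumes the body is a JSON object shaped like {"error": {...}}.
    """
    body = json.loads(payload)
    error = body.get("error", {})
    if error.get("code") != 403:
        return False
    # The specific reason is listed per entry under "errors".
    return any(
        entry.get("reason") == "rateLimitExceeded"
        for entry in error.get("errors", [])
    )


# Sample body mirroring the logged error, wrapped in the assumed outer object.
_MESSAGE = (
    "Exceeded rate limits: too many api requests per user per method "
    "for this user_method. For more information, see "
    "https://cloud.google.com/bigquery/troubleshooting-errors"
)
SAMPLE_RESPONSE = json.dumps(
    {
        "error": {
            "code": 403,
            "message": _MESSAGE,
            "errors": [
                {
                    "message": _MESSAGE,
                    "domain": "usageLimits",
                    "reason": "rateLimitExceeded",
                }
            ],
            "status": "PERMISSION_DENIED",
        }
    }
)

if __name__ == "__main__":
    print(is_rate_limit_error(SAMPLE_RESPONSE))
```

Checking {{reason}} rather than only the 403 code matters here because the status is reported as {{PERMISSION_DENIED}}, which on its own could be mistaken for a credentials problem.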