[ https://issues.apache.org/jira/browse/BEAM-6831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17131411#comment-17131411 ]
Beam JIRA Bot commented on BEAM-6831:
-------------------------------------

This issue was marked "stale-assigned" and has not received a public comment in 7 days. It is now automatically unassigned. If you are still working on it, you can assign it to yourself again. Please also give an update about the status of the work.

> python sdk WriteToBigQuery excessive usage of metered API
> ---------------------------------------------------------
>
>                 Key: BEAM-6831
>                 URL: https://issues.apache.org/jira/browse/BEAM-6831
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.10.0
>            Reporter: Pesach Weinstock
>            Priority: P2
>              Labels: bigquery, dataflow, gcp, python
>         Attachments: apache-beam-py-sdk-gcp-bq-api-issue.png
>
>
> Right now, there is a potential issue with the python sdk where
> {{beam.io.gcp.bigquery.WriteToBigQuery}} calls the following api more often
> than needed:
> [https://www.googleapis.com/bigquery/v2/projects/<project-name>/datasets/<dataset-name>/tables/<table-name>?alt=json|https://www.googleapis.com/bigquery/v2/projects/%3Cproject-name%3E/datasets/%3Cdataset-name%3E/tables/%3Ctable-name%3E?alt=json]
> The above request falls under specific bigquery API quotas which are excluded
> from bigquery streaming inserts. When used in a streaming pipeline, we hit
> this quota pretty quickly, and cannot proceed to write any further data to
> bigquery.
> Dispositions being used are:
> * create_disposition: {{beam.io.BigQueryDisposition.CREATE_NEVER}}
> * write_disposition: {{beam.io.BigQueryDisposition.WRITE_APPEND}}
> This is currently blocking us from using bigqueryIO in a streaming pipeline
> to write to bigquery, and required us to formally request an API quota
> increase from Google to temporarily correct the situation.
> Our pipeline uses DataflowRunner. Error seen is below, and in attached
> screenshot of stackdriver trace.
> {code:java}
> "errors": [
>   {
>     "message": "Exceeded rate limits: too many api requests per user per method for this user_method. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors",
>     "domain": "usageLimits",
>     "reason": "rateLimitExceeded"
>   }
> ],
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)