ahmedabu98 commented on PR #27434: URL: https://github.com/apache/beam/pull/27434#issuecomment-1629658908
> That's awesome! curious about what was wrong and how it is fixed

Thanks! Yeah, it took a while to figure out because our fake testing service doesn't propagate an error. I ran the same pipeline a large number of times against a real BigQuery table and hit the same behavior (copy job stuck retrying) on one occurrence. BQ was returning a "table already exists" error, and the copy job kept retrying. Write disposition defaults to `WRITE_EMPTY`, so I set it to `WRITE_APPEND` and reran many times without hitting the problem again.

We're supposed to already be covering this here (note this test is a streaming pipeline): https://github.com/apache/beam/blob/4c66866aa9544d1796c7c3880192cb57d2a8dcc0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteRename.java#L187-L196

i.e. we always set it to `WRITE_APPEND` after the first trigger of copy jobs. Somehow this doesn't always work, and the user-specified disposition continues to be used. I've run the tests in a few different ways, and the common denominator seems to be that this happens when the table is created beforehand (as opposed to letting the pipeline create it).
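For readers unfamiliar with the intended behavior, here is a minimal sketch of the idea, not the actual `WriteRename` code. All names (`CopyDispositionSketch`, `effectiveDisposition`, `FakeTable`, `copyInto`) are hypothetical, and the toy table only models the documented `WRITE_EMPTY` rule that a copy into a table already containing data is rejected.

```java
public class CopyDispositionSketch {
    // Hypothetical stand-in for the BigQuery write dispositions discussed above.
    enum WriteDisposition { WRITE_EMPTY, WRITE_APPEND }

    // Use the user-specified disposition only for the first trigger (pane 0);
    // every later trigger appends to the table the first copy job populated.
    static WriteDisposition effectiveDisposition(long paneIndex, WriteDisposition userSpecified) {
        return paneIndex == 0 ? userSpecified : WriteDisposition.WRITE_APPEND;
    }

    // Toy destination table (assumption: not the BigQuery API).
    static final class FakeTable {
        int rows = 0;
    }

    // Toy copy job: WRITE_EMPTY is rejected once the destination holds data.
    static boolean copyInto(FakeTable dest, int rowsToCopy, WriteDisposition disposition) {
        if (disposition == WriteDisposition.WRITE_EMPTY && dest.rows > 0) {
            return false;
        }
        dest.rows += rowsToCopy;
        return true;
    }

    public static void main(String[] args) {
        FakeTable dest = new FakeTable();
        for (long pane = 0; pane < 3; pane++) {
            WriteDisposition d = effectiveDisposition(pane, WriteDisposition.WRITE_EMPTY);
            boolean ok = copyInto(dest, 10, d);
            System.out.println("pane " + pane + ": " + d + " -> " + (ok ? "ok" : "rejected"));
        }
    }
}
```

In this toy model, reusing `WRITE_EMPTY` on a later trigger is rejected every time, which would match a copy job that keeps retrying if the override is not applied.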
