rajkgupt opened a new issue, #24001:
URL: https://github.com/apache/beam/issues/24001
### What would you like to happen?
**Problem Encountered:**
For a BigQueryIO.Write configured as in [1], if the target table doesn’t
exist, the pipeline throws a 404 Table Not Found exception and retries the
work item indefinitely [2].
By contrast, insert errors (malformed JSON or schema mismatches) are caught
and surfaced via getFailedInsertsWithErr.
This was most recently reproduced on Apache Beam SDK for Java 2.39.0.
**What you expected to happen:**
Table-not-found errors should be surfaced through getFailedInsertsWithErr so
that the affected records can be handled separately (for example, written to a
dead-letter queue or to GCS).
[1]
`WriteResult writeResult =
    results.get(SUCCESS_TAG).apply("WriteSuccessfulRecordsToBQ",
        BigQueryIO.writeTableRows()
            .withMethod(BigQueryIO.Write.Method.STREAMING_INSERTS)
            // Retry all failures except for known persistent errors.
            .withFailedInsertRetryPolicy(InsertRetryPolicy.retryTransientErrors())
            .withWriteDisposition(WRITE_APPEND)
            .withCreateDisposition(CREATE_NEVER)
            .withExtendedErrorInfo() // enables getFailedInsertsWithErr
            .ignoreUnknownValues()
            .skipInvalidRows()
            .withoutValidation()
            // Route each row to a table named after its event_type field.
            .to((row) -> {
              String tableName =
                  Objects.requireNonNull(row.getValue()).get("event_type").toString();
              return new TableDestination(
                  String.format("%s:%s.%s", BQ_PROJECT, BQ_DATASET, tableName),
                  "Some destination");
            }));`
[2]
`Error message from worker: java.lang.RuntimeException:
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
POST
https://bigquery.googleapis.com/bigquery/v2/projects/dfdfdfdfdfd/datasets/sdfsdfdsfsfs/tables/dddddddd/insertAll?prettyPrint=false
{
"code" : 404,
"errors" : [ {
"domain" : "global",
"message" : "Not found: Table dfdfdfdfdfd:sdfsdfdsfsfs.dddddddd",
"reason" : "notFound"
} ],
"message" : "Not found: Table dfdfdfdfdfd:sdfsdfdsfsfs.dddddddd",
"status" : "NOT_FOUND"
}
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:1108)
org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:1161)`
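As a possible interim workaround (a sketch under assumptions, not Beam API), rows whose destination table is known not to exist could be split off before they reach the write in [1]. The plain-Java example below shows only the routing decision. `DeadLetterRouting`, `Routed`, and the `knownTables` set are hypothetical names introduced for illustration, and rows are modelled as simple string maps rather than `TableRow`. In a real pipeline the same check would presumably live in a `DoFn` ahead of BigQueryIO.Write, with `knownTables` loaded from wherever the list of existing tables is kept.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not part of Beam: splits rows before the BigQuery write. */
final class DeadLetterRouting {

  /** Result of the split: rows that are safe to write, and rows to dead-letter. */
  static final class Routed {
    final List<Map<String, String>> writable = new ArrayList<>();
    final List<Map<String, String>> deadLetter = new ArrayList<>();
  }

  /**
   * Routes each row by its "event_type" field, which the pipeline in [1]
   * uses as the destination table name. Rows with a missing field or an
   * unknown table go to the dead-letter list instead of the write.
   */
  static Routed route(List<Map<String, String>> rows, Set<String> knownTables) {
    Routed out = new Routed();
    for (Map<String, String> row : rows) {
      String table = row.get("event_type");
      // Null check first: immutable sets reject contains(null).
      if (table != null && knownTables.contains(table)) {
        out.writable.add(row);
      } else {
        out.deadLetter.add(row);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Assumed set of tables that exist in the dataset (illustrative values).
    Set<String> knownTables = Set.of("click", "purchase");
    List<Map<String, String>> rows = List.of(
        Map.of("event_type", "click", "user", "a"),
        Map.of("event_type", "dddddddd", "user", "b"), // table does not exist
        Map.of("user", "c"));                          // no event_type at all
    Routed routed = route(rows, knownTables);
    System.out.println("writable=" + routed.writable.size()
        + " deadLetter=" + routed.deadLetter.size());
  }
}
```

This only avoids the retry loop for tables known to be missing at routing time. A table deleted while the pipeline is running would still hit the 404 path described above, which is why handling it in getFailedInsertsWithErr is the actual request.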
### Issue Priority
Priority: 2
### Issue Component
Component: sdk-java-core