Abacn commented on code in PR #32360:
URL: https://github.com/apache/beam/pull/32360#discussion_r1761163086
##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java:
##########
@@ -1265,35 +1413,62 @@ public PCollection<T> expand(PBegin input) {
 checkArgument(getUseLegacySql() != null, "useLegacySql should not be null if query is set");
}
- checkArgument(getDatumReaderFactory() != null, "A readerDatumFactory is required");
+ if (getMethod() != TypedRead.Method.DIRECT_READ) {
+ checkArgument(
+ getSelectedFields() == null,
+ "Invalid BigQueryIO.Read: Specifies selected fields, "
+ + "which only applies when using Method.DIRECT_READ");
+
+ checkArgument(
+ getRowRestriction() == null,
+ "Invalid BigQueryIO.Read: Specifies row restriction, "
+ + "which only applies when using Method.DIRECT_READ");
+ } else if (getTableProvider() == null) {
+ checkArgument(
+ getSelectedFields() == null,
+ "Invalid BigQueryIO.Read: Specifies selected fields, "
+ + "which only applies when reading from a table");
+
+ checkArgument(
+ getRowRestriction() == null,
+ "Invalid BigQueryIO.Read: Specifies row restriction, "
+ + "which only applies when reading from a table");
+ }
- // if both toRowFn and fromRowFn values are set, enable Beam schema support
Pipeline p = input.getPipeline();
BigQueryOptions bqOptions = p.getOptions().as(BigQueryOptions.class);
final BigQuerySourceDef sourceDef = createSourceDef();
+ // schema may need to be requested during graph creation to infer coder or beam schema
+ TableSchema tableSchema = null;
+
+ // read table schema and infer coder if possible
+ Coder<T> c;
+ if (getCoder() == null) {
+ tableSchema = requestTableSchema(sourceDef, bqOptions, getSelectedFields());
Review Comment:
Yeah, this is a valid concern. I've heard of use cases where the pipeline
submission machine has no permission, or only incomplete permission, on the
resource, so inferring the schema at graph creation time can cause issues. The
general guideline is that a use case that used to work should still work (and
vice versa).
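
To make the concern concrete, here is a minimal, self-contained sketch of one
way the guideline could be honored. It is not the code in this PR and it does
not use the Beam or BigQuery APIs: `GraphTimeSchemaSketch`, `resolveCoder`,
`AccessDeniedException`, and the `schemaLookup` supplier are all hypothetical
names chosen for illustration. The idea is that an explicitly configured coder
never triggers a service call at graph creation time, and that a denied lookup
is reported to the caller instead of failing pipeline construction outright.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch; none of these names exist in Beam.
class GraphTimeSchemaSketch {

  /** Stand-in for a permission failure on the pipeline submission machine. */
  static class AccessDeniedException extends RuntimeException {
    AccessDeniedException(String message) {
      super(message);
    }
  }

  /**
   * Returns the explicit coder name when one is configured. Otherwise attempts
   * the schema lookup, and returns an empty Optional if the lookup is denied so
   * the caller can decide how to defer or fail.
   */
  static Optional<String> resolveCoder(String explicitCoder, Supplier<String> schemaLookup) {
    if (explicitCoder != null) {
      // Pre-existing behavior: no schema request at graph creation time.
      return Optional.of(explicitCoder);
    }
    try {
      return Optional.of("inferred:" + schemaLookup.get());
    } catch (AccessDeniedException e) {
      // Submission machine lacks access; do not break graph construction here.
      return Optional.empty();
    }
  }
}
```

With this shape, a pipeline that sets its coder explicitly behaves the same
whether or not the submission machine can reach the table, which is the "used
to work should still work" case.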