gemini-code-assist[bot] commented on code in PR #37079:
URL: https://github.com/apache/beam/pull/37079#discussion_r2608890211
##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java:
##########
@@ -2293,6 +2302,18 @@ public TypedRead<T> withMethod(TypedRead.Method method) {
       return toBuilder().setMethod(method).build();
     }
+    /**
+     * Sets the timestamp precision to request for TIMESTAMP(12) BigQuery columns when reading via
+     * the Storage Read API.
+     *
+     * <p>This option only affects precision of TIMESTAMP(12) column reads using {@link
+     * Method#DIRECT_READ}. The default is {@link
+     * org.apache.beam.sdk.io.gcp.bigquery.TimestampPrecision#NANOS}.
Review Comment:

The Javadoc states that the default precision is `NANOS`. However, if
`withTimestampPrecision` is not called, no precision is sent to the BigQuery
Storage API, which then defaults to `MICROS`. The test `runTimestampTest` also
confirms that the effective default is `MICROS` when the option is not
specified. Please update the documentation to reflect that `MICROS` is the
default precision.
```suggestion
* org.apache.beam.sdk.io.gcp.bigquery.TimestampPrecision#MICROS}.
```
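
For reference, a pipeline that does want nanosecond values would have to opt in explicitly. Below is a minimal usage sketch, not part of this PR's diff: the table reference and `PCollection` wiring are placeholders, and it assumes the `withTimestampPrecision` setter introduced in this PR.
```java
// Hypothetical snippet: "my-project:my_dataset.table_with_timestamp12" is a placeholder.
// Nanosecond precision must be requested explicitly, because when
// withTimestampPrecision is not called the Storage Read API defaults to MICROS.
PCollection<TableRow> rows =
    pipeline.apply(
        "ReadTimestamp12",
        BigQueryIO.readTableRows()
            .from("my-project:my_dataset.table_with_timestamp12")
            .withMethod(TypedRead.Method.DIRECT_READ)
            .withTimestampPrecision(TimestampPrecision.NANOS));
```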
##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageSourceBase.java:
##########
@@ -199,4 +205,41 @@ public List<BigQueryStorageStreamSource<T>> split(
   public BoundedReader<T> createReader(PipelineOptions options) throws IOException {
     throw new UnsupportedOperationException("BigQuery storage source must be split before reading");
   }
+
+  private void setTimestampPrecision(
+      ReadSession.TableReadOptions.Builder tableReadOptionsBuilder, DataFormat dataFormat) {
+    if (timestampPrecision == null) {
+      return;
+    }
+    switch (timestampPrecision) {
+      case NANOS:
+        if (dataFormat == DataFormat.ARROW) {
+          tableReadOptionsBuilder.setArrowSerializationOptions(
+              ArrowSerializationOptions.newBuilder()
+                  .setPicosTimestampPrecision(
+                      ArrowSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_NANOS));
+        } else {
+          tableReadOptionsBuilder.setAvroSerializationOptions(
+              AvroSerializationOptions.newBuilder()
+                  .setPicosTimestampPrecision(
+                      AvroSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_NANOS));
+        }
+        break;
+      case PICOS:
+        if (dataFormat == DataFormat.ARROW) {
+          tableReadOptionsBuilder.setArrowSerializationOptions(
+              ArrowSerializationOptions.newBuilder()
+                  .setPicosTimestampPrecision(
+                      ArrowSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_PICOS));
+        } else {
+          tableReadOptionsBuilder.setAvroSerializationOptions(
+              AvroSerializationOptions.newBuilder()
+                  .setPicosTimestampPrecision(
+                      AvroSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_PICOS));
+        }
+        break;
+      default:
+        break;
+    }
+  }
Review Comment:

The `setTimestampPrecision` method has a nested structure that leads to code
duplication for handling `ARROW` and `AVRO` formats. This can be refactored to
first switch on the `dataFormat` and then handle the precision. This would make
the code more readable and easier to maintain.
```java
  private void setTimestampPrecision(
      ReadSession.TableReadOptions.Builder tableReadOptionsBuilder, DataFormat dataFormat) {
    if (timestampPrecision == null || timestampPrecision == TimestampPrecision.MICROS) {
      // MICROS is the default for the Storage API, so we don't need to set anything.
      return;
    }
    if (dataFormat == DataFormat.ARROW) {
      ArrowSerializationOptions.Builder arrowOptions = ArrowSerializationOptions.newBuilder();
      switch (timestampPrecision) {
        case NANOS:
          arrowOptions.setPicosTimestampPrecision(
              ArrowSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_NANOS);
          break;
        case PICOS:
          arrowOptions.setPicosTimestampPrecision(
              ArrowSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_PICOS);
          break;
        default:
          // Should be unreachable given the check above.
          return;
      }
      tableReadOptionsBuilder.setArrowSerializationOptions(arrowOptions);
    } else if (dataFormat == DataFormat.AVRO) {
      AvroSerializationOptions.Builder avroOptions = AvroSerializationOptions.newBuilder();
      switch (timestampPrecision) {
        case NANOS:
          avroOptions.setPicosTimestampPrecision(
              AvroSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_NANOS);
          break;
        case PICOS:
          avroOptions.setPicosTimestampPrecision(
              AvroSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_PICOS);
          break;
        default:
          // Should be unreachable.
          return;
      }
      tableReadOptionsBuilder.setAvroSerializationOptions(avroOptions);
    }
  }
```
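
Optionally, the remaining per-precision switches could be collapsed further. The sketch below is only an illustration and assumes `TimestampPrecision` has exactly the MICROS/NANOS/PICOS values used above; each format branch then reduces to a single builder call.
```java
  // Sketch only: assumes TimestampPrecision is limited to MICROS, NANOS, and PICOS.
  private void setTimestampPrecision(
      ReadSession.TableReadOptions.Builder tableReadOptionsBuilder, DataFormat dataFormat) {
    if (timestampPrecision == null || timestampPrecision == TimestampPrecision.MICROS) {
      return; // MICROS is the Storage API default; nothing to set.
    }
    boolean nanos = timestampPrecision == TimestampPrecision.NANOS;
    if (dataFormat == DataFormat.ARROW) {
      tableReadOptionsBuilder.setArrowSerializationOptions(
          ArrowSerializationOptions.newBuilder()
              .setPicosTimestampPrecision(
                  nanos
                      ? ArrowSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_NANOS
                      : ArrowSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_PICOS));
    } else if (dataFormat == DataFormat.AVRO) {
      tableReadOptionsBuilder.setAvroSerializationOptions(
          AvroSerializationOptions.newBuilder()
              .setPicosTimestampPrecision(
                  nanos
                      ? AvroSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_NANOS
                      : AvroSerializationOptions.PicosTimestampPrecision.TIMESTAMP_PRECISION_PICOS));
    }
  }
```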