Abacn commented on code in PR #36425:
URL: https://github.com/apache/beam/pull/36425#discussion_r2461432617


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java:
##########
@@ -2467,10 +2525,13 @@ abstract static class Builder<T> {
       abstract Builder<T> setTableFunction(
           SerializableFunction<ValueInSingleWindow<T>, TableDestination> tableFunction);
 
-      abstract Builder<T> setFormatFunction(SerializableFunction<T, TableRow> formatFunction);
+      abstract Builder<T> setFormatFunction(
+          SerializableBiFunction<TableRowToStorageApiProto.SchemaInformation, T, TableRow>
+              formatFunction);
 
       abstract Builder<T> setFormatRecordOnFailureFunction(
-          SerializableFunction<T, TableRow> formatFunction);
+          SerializableBiFunction<TableRowToStorageApiProto.SchemaInformation, T, TableRow>

Review Comment:
   If I understand correctly, the purpose of changing the format function to a BiFunction is:
   
   > we had to extend the internal format function to take extra information about the BigQuery schema.
   
   I see that the first parameter is ignored in most cases (batch load and streaming insert).
   
   Would it be cleaner to use a dedicated interface for the internal format function now, e.g.,
   
   ```
     interface TableRowFormatFunction<T> extends SerializableFunction<T, TableRow> {
       // inherited: TableRow apply(T element);
   
       default TableRow apply(TableRowToStorageApiProto.SchemaInformation info, T element) {
         throw new UnsupportedOperationException("Not implemented");
       }
   
       static <T> TableRowFormatFunction<T> fromSerializableFunction(SerializableFunction<T, TableRow> fn) {
         return new TableRowFormatFunction<T>() {
           @Override
           public TableRow apply(T input) {
             return fn.apply(input);
           }
         };
       }
     }
   ```
   Hopefully this eliminates the long type names introduced by the new BiFunction. If other kinds of apply methods need to be added later, the interface is also extensible (no need for TriFunctions, etc.).
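
To make the suggestion concrete, here is a minimal, self-contained sketch of how the two kinds of call sites might look with a dedicated interface. It is only an illustration under stated assumptions: `SerializableFunction`, `TableRow`, and `SchemaInformation` below are simplified stand-ins for the real Beam/BigQuery types so the snippet compiles without Beam on the classpath, and `FormatFunctionSketch`, `plain()`, `schemaAware()`, and `row()` are hypothetical names that do not exist in the PR.

```java
import java.io.Serializable;
import java.util.LinkedHashMap;

// Stand-in for Beam's SerializableFunction (assumption: simplified for this sketch).
interface SerializableFunction<InT, OutT> extends Serializable {
  OutT apply(InT input);
}

// Stand-in for the BigQuery TableRow model class (assumption).
class TableRow extends LinkedHashMap<String, Object> {}

// Stand-in for TableRowToStorageApiProto.SchemaInformation (assumption).
class SchemaInformation implements Serializable {
  final String name;

  SchemaInformation(String name) {
    this.name = name;
  }
}

// The dedicated interface proposed in the review comment.
interface TableRowFormatFunction<T> extends SerializableFunction<T, TableRow> {
  // Schema-aware overload; only write paths that need the schema override it.
  default TableRow apply(SchemaInformation info, T element) {
    throw new UnsupportedOperationException("Not implemented");
  }

  // Adapts an existing one-argument format function.
  static <T> TableRowFormatFunction<T> fromSerializableFunction(
      SerializableFunction<T, TableRow> fn) {
    return new TableRowFormatFunction<T>() {
      @Override
      public TableRow apply(T input) {
        return fn.apply(input);
      }
    };
  }
}

class FormatFunctionSketch {
  // Hypothetical helper that builds a one-column row.
  static TableRow row(String key, Object value) {
    TableRow r = new TableRow();
    r.put(key, value);
    return r;
  }

  // Batch load / streaming insert style: the schema is never consulted.
  static TableRowFormatFunction<String> plain() {
    return TableRowFormatFunction.<String>fromSerializableFunction(s -> row("value", s));
  }

  // Storage API style: overrides the schema-aware overload as well.
  static TableRowFormatFunction<String> schemaAware() {
    return new TableRowFormatFunction<String>() {
      @Override
      public TableRow apply(String input) {
        return row("value", input);
      }

      @Override
      public TableRow apply(SchemaInformation info, String input) {
        TableRow r = apply(input);
        r.put("schema", info.name);
        return r;
      }
    };
  }

  public static void main(String[] args) {
    System.out.println(plain().apply("a"));
    System.out.println(schemaAware().apply(new SchemaInformation("my_table"), "a"));
  }
}
```

With this shape, existing one-argument call sites keep passing a plain function through `fromSerializableFunction`, and only the schema-aware path has to mention the schema type. The trade-off is that calling the two-argument overload on an adapted function fails at runtime rather than at compile time.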



