vinishjail97 commented on code in PR #9482: URL: https://github.com/apache/hudi/pull/9482#discussion_r1300451056
##########
hudi-gcp/src/main/java/org/apache/hudi/gcp/bigquery/HoodieBigQuerySyncClient.java:
##########
@@ -147,6 +159,31 @@ public void createManifestTable(String tableName, String sourceUri) {
     }
   }
 
+  /**
+   * Updates the schema for the given table if the schema has changed.
+   * @param tableName name of the table in BigQuery
+   * @param schema latest schema for the table
+   */
+  public void updateTableSchema(String tableName, Schema schema, List<String> partitionFields) {
+    Table existingTable = bigquery.getTable(TableId.of(projectId, datasetName, tableName));
+    ExternalTableDefinition definition = existingTable.getDefinition();
+    Schema remoteTableSchema = definition.getSchema();
+    // Add the partition fields into the schema to avoid conflicts while updating
+    List<Field> updatedTableFields = remoteTableSchema.getFields().stream()
+        .filter(field -> partitionFields.contains(field.getName()))
+        .collect(Collectors.toList());
+    updatedTableFields.addAll(schema.getFields());
+    Schema finalSchema = Schema.of(updatedTableFields);
+    if (definition.getSchema() != null && definition.getSchema().equals(finalSchema)) {
+      return; // No need to update schema.
+    }
+    Table updatedTable = existingTable.toBuilder()

Review Comment:
   Clarification: we create the table with a schema that excludes the partition fields, but we update it with a schema that includes them. Are we sure this is the right thing to do? In the path without manifests, which uses ExternalTableDefinition, we use the schema without partition fields.
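
To make the question concrete, here is a minimal sketch of the merge step from the hunk above, written in plain Java with no BigQuery dependency. It is an illustration under assumptions, not the PR's code: plain strings stand in for `Field` objects, and the class name `SchemaMergeSketch`, the method `mergeForUpdate`, and the field names `dt`, `id`, and `name` are all hypothetical. It only models how the merged field list is built; it makes no claim about what BigQuery itself stores in the remote schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class SchemaMergeSketch {

  // Models the merge in updateTableSchema: keep only those remote fields
  // that are partition fields, then append the fields of the latest schema.
  // Field names (String) are a stand-in for BigQuery Field objects.
  static List<String> mergeForUpdate(List<String> remoteFields,
                                     List<String> latestFields,
                                     List<String> partitionFields) {
    // Collect into an ArrayList explicitly so that addAll is guaranteed to work.
    List<String> merged = remoteFields.stream()
        .filter(partitionFields::contains)
        .collect(Collectors.toCollection(ArrayList::new));
    merged.addAll(latestFields);
    return merged;
  }

  public static void main(String[] args) {
    List<String> partitionFields = List.of("dt"); // hypothetical partition column
    List<String> latest = List.of("id", "name");  // latest schema, no partition fields

    // Remote schema that does not contain the partition field.
    System.out.println(mergeForUpdate(List.of("id", "name"), latest, partitionFields));
    // prints [id, name]

    // Remote schema that already contains the partition field.
    System.out.println(mergeForUpdate(List.of("dt", "id", "name"), latest, partitionFields));
    // prints [dt, id, name]
  }
}
```

In this model the updated schema carries partition fields only when the remote schema already reports them, so the schema sent on update can differ from the one supplied at creation. Whether that difference is intended is the point the comment asks about.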