n3nash commented on a change in pull request #2927:
URL: https://github.com/apache/hudi/pull/2927#discussion_r630724709

##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/TableSchemaResolver.java
##########
@@ -353,6 +361,91 @@ public static boolean isSchemaCompatible(String oldSchema, String newSchema) {
     return isSchemaCompatible(new Schema.Parser().parse(oldSchema), new Schema.Parser().parse(newSchema));
   }
 
+  /**
+   * Get latest schema either from incoming schema or table schema.
+   * @param incomingSchema incoming batch's schema.
+   * @param convertTableSchemaToAddNamespace {@code true} if table schema needs to be converted. {@code false} otherwise.
+   * @param converterFn converter function to be called over table schema. In DeltaSync flow, table schema needs to convert
+   *                    from avro -> df -> avro to add the namespace in the schema. But in spark writer flow, no such conversion is required.
+   *                    This package does not have access to some elements needed for conversion, hence added it as function call rather than embedding here.
+   * @return the latest schema.
+   */
+  public Schema getLatestSchema(Schema incomingSchema, boolean convertTableSchemaToAddNamespace,
+      Function1<Schema, Schema> converterFn) {
+    Schema latestSchema = incomingSchema;
+    try {
+      if (isTimelineNonEmpty()) {
+        Schema tableSchema = getTableAvroSchemaWithoutMetadataFields();
+        if (convertTableSchemaToAddNamespace) {
+          tableSchema = converterFn.apply(tableSchema);
+        }
+        if (incomingSchema.getFields().size() < tableSchema.getFields().size() && isSchemaSubset(tableSchema, incomingSchema)) {

Review comment:
What if a nested field has been added to the new incoming schema? In that case, does `incomingSchema.getFields()` count it? Basically, what level of information does `getFields()` return?
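
To make the question concrete: in Avro, `Schema.getFields()` on a record schema returns only that record's own (top-level) fields, and the fields of a nested record are reached through the nested field's schema. A size comparison on `getFields()` would therefore not notice a field added inside a nested record. The sketch below illustrates this with a small stand-in model instead of the Avro classes; `FieldCountSketch`, `Field`, `topLevelCount`, `leafCount`, and the two sample schemas are hypothetical names invented for this example and are not part of Hudi or Avro.

```java
import java.util.Arrays;
import java.util.List;

public class FieldCountSketch {

    /** Hypothetical stand-in for a schema field; this is NOT Avro's Schema.Field. */
    public static final class Field {
        final String name;
        final List<Field> children; // empty for a primitive (non-record) field

        public Field(String name, Field... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    /** What a top-level size check sees: only the direct fields of the record. */
    public static int topLevelCount(List<Field> fields) {
        return fields.size();
    }

    /** Counts leaf fields at every nesting depth (a field without children counts as one leaf). */
    public static int leafCount(List<Field> fields) {
        int count = 0;
        for (Field f : fields) {
            count += f.children.isEmpty() ? 1 : leafCount(f.children);
        }
        return count;
    }

    /** Sample table schema: id, address { city }. */
    public static List<Field> tableSchema() {
        return Arrays.asList(new Field("id"), new Field("address", new Field("city")));
    }

    /** Sample incoming schema: id, address { city, zip }; "zip" is the new nested field. */
    public static List<Field> incomingSchema() {
        return Arrays.asList(
            new Field("id"),
            new Field("address", new Field("city"), new Field("zip")));
    }

    public static void main(String[] args) {
        // The top-level counts are equal even though the incoming schema gained a nested field.
        System.out.println("top-level: table=" + topLevelCount(tableSchema())
            + ", incoming=" + topLevelCount(incomingSchema()));
        // A recursive count does see the difference.
        System.out.println("leaves:    table=" + leafCount(tableSchema())
            + ", incoming=" + leafCount(incomingSchema()));
    }
}
```

If the check in `getLatestSchema` is meant to detect nested additions as well, a recursive comparison along these lines (or relying on `isSchemaSubset` alone, depending on how it treats nested records) may be needed; whether that is the intended behavior is a question for the PR author.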