danny0405 commented on code in PR #9876:
URL: https://github.com/apache/hudi/pull/9876#discussion_r1368075473


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieAppendHandle.java:
##########
@@ -652,6 +660,16 @@ private static Map<HeaderMetadataType, String> getUpdatedHeader(Map<HeaderMetada
     if (addBlockIdentifier && !HoodieTableMetadata.isMetadataTable(config.getBasePath())) { // add block sequence numbers only for data table.
       updatedHeader.put(HeaderMetadataType.BLOCK_IDENTIFIER, attemptNumber + "," + blockSequenceNumber);
     }
+    if (config.shouldWritePartialUpdates()) {
+      // When enabling writing partial updates to the data blocks, the full schema is also written
+      // to the block header so that the reader can differentiate partial updates vs schema
+      // evolution, based on the "SCHEMA" which contains the partial schema and the "FULL_SCHEMA"
+      // which contains the full schema of the table at this time.
+      updatedHeader.put(
+          HeaderMetadataType.FULL_SCHEMA,
+          HoodieAvroUtils.addMetadataFields(
+              getWriteSchema(config), config.allowOperationMetadataField()).toString());

Review Comment:
   I'm fine if we deem the partial update a special case of schema
evolution. In that case the full schema should always be the current latest
table schema, and we should not encode it in the log block, because it
evolves automatically.
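
   A rough sketch of the reader-side idea, to make the suggestion concrete.
This is not Hudi code: the class and method names are hypothetical, and
schemas are reduced to plain lists of field names for illustration. It assumes
the reader can obtain the latest table schema on its own and only needs the
partial "SCHEMA" from the block header.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not part of Hudi: schemas are modeled as ordered
// lists of field names instead of Avro schemas.
class SchemaResolutionSketch {

  // Returns the fields of the latest table schema that the log block did not
  // write. For a partial update these are the fields the reader has to take
  // from the existing record; no FULL_SCHEMA header is consulted.
  static List<String> fieldsMissingFromBlock(List<String> blockFields,
                                             List<String> latestTableFields) {
    List<String> missing = new ArrayList<>();
    for (String field : latestTableFields) {
      if (!blockFields.contains(field)) {
        missing.add(field);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // Partial schema as it would appear under "SCHEMA" in the block header.
    List<String> blockFields = Arrays.asList("id", "price");
    // Latest table schema, resolved by the reader outside the log block.
    List<String> latestTableFields = Arrays.asList("id", "name", "price", "ts");

    System.out.println(fieldsMissingFromBlock(blockFields, latestTableFields));
  }
}
```

   With this approach the block header stays the same size regardless of how
wide the table schema is, at the cost of the reader no longer knowing what the
table schema looked like when the block was written.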


