satishkotha commented on a change in pull request #1320: [HUDI-571] Add min/max headers on archived files
URL: https://github.com/apache/incubator-hudi/pull/1320#discussion_r378485704
 
 

 ##########
 File path: hudi-client/src/main/java/org/apache/hudi/io/HoodieCommitArchiveLog.java
 ##########
 @@ -268,6 +270,19 @@ public Path getArchiveFilePath() {
     return archiveFilePath;
   }
 
+  private void writeHeaderBlock(Schema wrapperSchema, List<HoodieInstant> instants) throws Exception {
+    if (!instants.isEmpty()) {
+      Collections.sort(instants, HoodieInstant.COMPARATOR);
+      HoodieInstant minInstant = instants.get(0);
+      HoodieInstant maxInstant = instants.get(instants.size() - 1);
+      Map<HeaderMetadataType, String> metadataMap = Maps.newHashMap();
+      metadataMap.put(HeaderMetadataType.SCHEMA, wrapperSchema.toString());
+      metadataMap.put(HeaderMetadataType.MIN_INSTANT_TIME, minInstant.getTimestamp());
+      metadataMap.put(HeaderMetadataType.MAX_INSTANT_TIME, maxInstant.getTimestamp());
+      this.writer.appendBlock(new HoodieAvroDataBlock(Collections.emptyList(), metadataMap));
+    }
+  }
+
   private void writeToFile(Schema wrapperSchema, List<IndexedRecord> records) throws Exception {
 
 Review comment:
   I've included my reasoning for adding the header block above; let me know
what you think. The file is closed after archiving all instants that qualify,
so I don't think file growth is an issue. Correct me if I'm reading this wrong.
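
   For reference, the min/max selection in writeHeaderBlock can be sketched in
isolation as below. This is only an illustration, not code from the PR: the
class InstantRange and the helper minMax are hypothetical names, plain strings
stand in for HoodieInstant timestamps, and it assumes fixed-width timestamp
strings so that string order matches time order. Unlike the diff, it uses
Collections.min/max rather than sorting, so the caller's list is not reordered.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the header computation; not part of Hudi.
class InstantRange {

  // Returns {min, max} of the given timestamps, or null for an empty list
  // (mirroring the diff, which writes no header block in that case).
  // The input list is left in its original order.
  static String[] minMax(List<String> timestamps) {
    if (timestamps.isEmpty()) {
      return null;
    }
    return new String[] {Collections.min(timestamps), Collections.max(timestamps)};
  }

  public static void main(String[] args) {
    // Assumed sample timestamps in yyyyMMddHHmmss form.
    List<String> timestamps =
        Arrays.asList("20200212103000", "20200210090000", "20200211120000");
    String[] range = minMax(timestamps);
    System.out.println("min=" + range[0] + " max=" + range[1]);
  }
}
```

   Whether mutating the instants list via Collections.sort matters here depends
on how the caller uses it afterwards, which this excerpt does not show.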
