rohan-uptycs commented on code in PR #8503:
URL: https://github.com/apache/hudi/pull/8503#discussion_r1179895882


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/HoodieTimelineArchiver.java:
##########
@@ -509,7 +509,15 @@ private Stream<HoodieInstant> getCommitInstantsToArchive() throws IOException {
   }
 
   private Stream<HoodieInstant> getInstantsToArchive() throws IOException {
-    Stream<HoodieInstant> instants = Stream.concat(getCleanInstantsToArchive(), getCommitInstantsToArchive());
+    List<HoodieInstant> commitInstantsToArchive = getCommitInstantsToArchive().collect(Collectors.toList());
+    Stream<HoodieInstant> instants = Stream.concat(getCleanInstantsToArchive(), commitInstantsToArchive.stream());
+    HoodieInstant hoodieOldestInstantToArchive = commitInstantsToArchive.stream().max(Comparator.comparing(maxInstant -> maxInstant.getTimestamp())).orElse(null);
+    /**
+     * if hoodieOldestInstantToArchive is null that means nothing is getting archived, so no need to update metadata
+     */
+    if (hoodieOldestInstantToArchive != null) {
+      table.getIndex().updateMetadata(table, Option.of(hoodieOldestInstantToArchive));

Review Comment:
   @SteNicholas, yes, it can be invoked, but I see a few problems with it.
   If the underlying file system is down and **updateMetadata** fails to sync the metadata, there is no mechanism to bring it back in sync with the latest committed metadata. Archival will eventually remove the replace commit, and the table will end up in an inconsistent state.
   In the **archival process**, on the other hand, **the metadata will eventually be in sync with the committed metadata** before the replace commit is archived.
   I think consistent hashing metadata has a strong dependency on the archival process, because it depends on the replace commit in the active timeline to load metadata.
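
   To make the ordering argument concrete, here is a minimal sketch. It is a simplified model, not Hudi code: `ArchivalSyncSketch`, `MetadataSync`, and the string timestamps are hypothetical stand-ins, and it assumes that a failed sync aborts the archival run so the same instants are picked up again on the next run.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified model of the ordering discussed above.
// None of these types are Hudi classes.
class ArchivalSyncSketch {

  /** Stand-in for the index metadata sync; may fail if storage is unavailable. */
  interface MetadataSync {
    void sync(String upToInstant) throws IOException;
  }

  // Instants still on the active timeline (fixed-width timestamps, so
  // lexicographic order matches chronological order).
  final List<String> activeTimeline = new ArrayList<>();

  // Latest instant the metadata has been synced up to, or null if never synced.
  String syncedUpTo = null;

  /**
   * Syncs metadata first and removes instants from the active timeline only
   * if the sync succeeded. Returns true when instants were archived.
   */
  boolean archive(List<String> toArchive, MetadataSync sync) {
    Optional<String> latest = toArchive.stream().max(Comparator.naturalOrder());
    if (!latest.isPresent()) {
      return false; // nothing to archive, so nothing to sync
    }
    try {
      sync.sync(latest.get());
    } catch (IOException e) {
      // Assumption: abort the run. The instants stay on the active timeline,
      // so the next archival run retries the sync before removing them.
      return false;
    }
    syncedUpTo = latest.get();
    activeTimeline.removeAll(toArchive);
    return true;
  }

  public static void main(String[] args) {
    ArchivalSyncSketch table = new ArchivalSyncSketch();
    table.activeTimeline.addAll(Arrays.asList("001", "002", "003"));
    List<String> candidates = Arrays.asList("001", "002");

    // First run: storage is down, so the sync fails and nothing is archived.
    boolean first = table.archive(candidates, t -> { throw new IOException("fs down"); });
    System.out.println(first + " " + table.activeTimeline);

    // Second run: the sync succeeds, and only then are the instants removed.
    boolean second = table.archive(candidates, t -> { });
    System.out.println(second + " " + table.activeTimeline + " " + table.syncedUpTo);
  }
}
```

   In this model the replace commit cannot leave the active timeline until the metadata has been synced past it, which is the property I am relying on when tying the sync to archival.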
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

