[GitHub] [hudi] danny0405 commented on a diff in pull request #7561: [HUDI-5477][DO NOT MERGE] Optimize timeline loading in Hudi sync client

2022-12-27 Thread GitBox


danny0405 commented on code in PR #7561:
URL: https://github.com/apache/hudi/pull/7561#discussion_r1058040290


##
hudi-common/src/main/java/org/apache/hudi/common/table/HoodieTableMetaClient.java:
##
@@ -385,21 +386,44 @@ public HoodieMetastoreConfig getMetastoreConfig() {
   }
 
   /**
-   * Returns fresh new archived commits as a timeline from startTs (inclusive).
-   *
-   * This is costly operation if really early endTs is specified.
-   * Be caution to use this only when the time range is short.
-   *
-   * This method is not thread safe.
+   * Returns the cached archived timeline from startTs (inclusive).
    *
-   * @return Archived commit timeline
+   * @param startTs The start instant time (inclusive) of the archived timeline.
+   * @return the archived timeline.
    */
   public HoodieArchivedTimeline getArchivedTimeline(String startTs) {
-    return new HoodieArchivedTimeline(this, startTs);
+    return getArchivedTimeline(startTs, true);
+  }
+
+  /**
+   * Returns the cached archived timeline if using in-memory cache or a fresh new archived
+   * timeline if not using cache, from startTs (inclusive).
+   *
+   * Instantiating an archived timeline is costly operation if really early startTs is
+   * specified.
+   *
+   * This method is not thread safe.
+   *
+   * @param startTs  The start instant time (inclusive) of the archived timeline.
+   * @param useCache Whether to use in-memory cache.
+   * @return the archived timeline based on the arguments.
+   */
+  public HoodieArchivedTimeline getArchivedTimeline(String startTs, boolean useCache) {
+    if (useCache) {
+      return archivedTimelineMap.computeIfAbsent(startTs, this::instantiateArchivedTimeline);

Review Comment:
   When is the cache cleared?
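
   To make the question concrete, here is a minimal, self-contained sketch of a `computeIfAbsent`-backed cache with an explicit invalidation hook. Everything in it is hypothetical (the `TimelineCacheSketch` class, the `invalidate()` method, and the string stand-in for a loaded timeline); it is not the PR's implementation and does not use Hudi classes. It only illustrates that entries stay in the map until something clears them.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for a client that caches costly timeline loads
 * keyed by start instant. Not Hudi code; names are illustrative only.
 */
final class TimelineCacheSketch {
  // Plain HashMap: like the method under review, this sketch is not thread safe.
  private final Map<String, String> cache = new HashMap<>();
  private int loadCount = 0;

  // Simulates the expensive instantiation of an archived timeline.
  private String load(String startTs) {
    loadCount++;
    return "archived-from-" + startTs;
  }

  // Returns the cached value when useCache is true, otherwise loads a fresh one.
  String get(String startTs, boolean useCache) {
    return useCache ? cache.computeIfAbsent(startTs, this::load) : load(startTs);
  }

  // Assumed invalidation hook: without a call like this, entries are never evicted.
  void invalidate() {
    cache.clear();
  }

  int loadCount() {
    return loadCount;
  }

  public static void main(String[] args) {
    TimelineCacheSketch client = new TimelineCacheSketch();
    client.get("001", true);
    client.get("001", true);   // served from the cache, no second load
    client.invalidate();       // e.g. when the timeline is reloaded
    client.get("001", true);   // loaded again after invalidation
    System.out.println("loads = " + client.loadCount());
  }
}
```

   In a sketch like this, the natural place to call `invalidate()` would be wherever the owning object reloads its timeline, so that a cached entry cannot outlive newly archived instants; whether that fits this PR is for the author to confirm.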



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [hudi] danny0405 commented on a diff in pull request #7561: [HUDI-5477][DO NOT MERGE] Optimize timeline loading in Hudi sync client

2022-12-27 Thread GitBox


danny0405 commented on code in PR #7561:
URL: https://github.com/apache/hudi/pull/7561#discussion_r1058040183


##
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/TimelineUtils.java:
##
@@ -210,11 +210,30 @@ public static HoodieDefaultTimeline getTimeline(HoodieTableMetaClient metaClient
     return activeTimeline;
   }
 
+  /**
+   * Returns a Hudi timeline with commits after the given instant time (exclusive).
+   *
+   * @param metaClient                {@link HoodieTableMetaClient} instance.
+   * @param exclusiveStartInstantTime Start instant time (exclusive).
+   * @return Hudi timeline.
+   */
+  public static HoodieTimeline getCommitsTimelineAfter(
+      HoodieTableMetaClient metaClient, String exclusiveStartInstantTime) {
+    HoodieActiveTimeline activeTimeline = metaClient.getActiveTimeline();
+    HoodieDefaultTimeline timeline =
+        activeTimeline.isBeforeTimelineStarts(exclusiveStartInstantTime)
+            ? metaClient.getArchivedTimeline(exclusiveStartInstantTime)
+            .mergeTimeline(activeTimeline)
+            : activeTimeline;
+    return timeline.getCommitsTimeline()
+        .findInstantsAfter(exclusiveStartInstantTime, Integer.MAX_VALUE);
+  }

Review Comment:
   Only the active timeline needs to be filtered by the start instant time, i.e. by the `#findInstantsAfter` invocation.


