klsince commented on code in PR #13285:
URL: https://github.com/apache/pinot/pull/13285#discussion_r1634137440
##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeSegmentDataManager.java:
##########
@@ -703,9 +703,20 @@ public void run() {
           // persisted.
           // Take upsert snapshot before starting consuming events
           if (_partitionUpsertMetadataManager != null) {
-            _partitionUpsertMetadataManager.takeSnapshot();
-            // If upsertTTL is enabled, we will remove expired primary keys from upsertMetadata after taking snapshot.
-            _partitionUpsertMetadataManager.removeExpiredPrimaryKeys();
+            if (_tableConfig.getUpsertMetadataTTL() > 0) {
+              // If upsertMetadataTTL is enabled, we will remove expired primary keys from upsertMetadata
+              // AFTER taking a snapshot. Taking the snapshot first is crucial to ensure we capture the final
+              // state of a particular key before it exits the TTL window.

Review Comment:
   I see, that makes sense. Based on the metadata TTL related code, it seems that taking the snapshot after removing the metadata that is out of TTL wouldn't affect data/query correctness. It would only incur some extra overhead if the server restarted before taking a new snapshot, because the server would have to add the metadata (already out of TTL) back to the map, only for that metadata to be removed again.

   IIUC, maybe update the comment a bit to say that this ordering is mainly to avoid such overhead upon an unexpected server restart, and that it does not affect correctness.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: commits-h...@pinot.apache.org