tibrewalpratik17 commented on code in PR #13285: URL: https://github.com/apache/pinot/pull/13285#discussion_r1634413636
########## pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeSegmentDataManager.java: ########## @@ -703,9 +703,20 @@ public void run() { // persisted. // Take upsert snapshot before starting consuming events if (_partitionUpsertMetadataManager != null) { - _partitionUpsertMetadataManager.takeSnapshot(); - // If upsertTTL is enabled, we will remove expired primary keys from upsertMetadata after taking snapshot. - _partitionUpsertMetadataManager.removeExpiredPrimaryKeys(); + if (_tableConfig.getUpsertMetadataTTL() > 0) { + // If upsertMetadataTTL is enabled, we will remove expired primary keys from upsertMetadata + // AFTER taking a snapshot. Taking the snapshot first is crucial to ensure we capture the final + // state of a particular key before it exits the TTL window. Review Comment: > it seems like taking snapshot after removing metadata out of the TTL wouldn’t affect data/query correctness but incurred some extra overhead if server restarted before taking new snapshot as it had to add metadata (already out of TTL) back to Map, however those metadata would be removed again. Umm actually no, if a segment is out-of-TTL window during server restart then we don't follow the `doAddSegment` flow for them. So it should be a data correctness issue afaiu. ``` if (maxComparisonValue.doubleValue() < _largestSeenComparisonValue.get() - _metadataTTL) { _logger.info("Skip adding segment: {} because it's out of TTL", segmentName); MutableRoaringBitmap validDocIdsSnapshot = immutableSegment.loadValidDocIdsFromSnapshot(); if (validDocIdsSnapshot != null) { MutableRoaringBitmap queryableDocIds = getQueryableDocIds(segment, validDocIdsSnapshot); immutableSegment.enableUpsert(this, new ThreadSafeMutableRoaringBitmap(validDocIdsSnapshot), queryableDocIds != null ? new ThreadSafeMutableRoaringBitmap(queryableDocIds) : null); } else { _logger.warn("Failed to find snapshot from segment: {} which is out of TTL, treating all documents as valid", segmentName); } return; } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org