klsince commented on code in PR #13285:
URL: https://github.com/apache/pinot/pull/13285#discussion_r1631532746


##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/upsert/BasePartitionUpsertMetadataManager.java:
##########
@@ -832,8 +836,10 @@ public void takeSnapshot() {
     if (!_enableSnapshot) {
       return;
     }
-    if (!_gotFirstConsumingSegment) {
-      _logger.info("Skip taking snapshot before getting the first consuming segment");
+    if (_partialUpsertHandler == null && !_gotFirstConsumingSegment) {

Review Comment:
   I understand that it's safe to take a snapshot for a partial upsert table
   before _gotFirstConsumingSegment is set, but what is the benefit of taking
   the snapshot eagerly in that case?
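For reference, the guard under discussion can be sketched as a small predicate. This is only an illustration: the class and method names are hypothetical, and it assumes the new `if` still returns early the way the removed branch did (the hunk above is cut off before the body of the new condition).

```java
// Hypothetical stand-alone model of the guard in takeSnapshot(); not Pinot code.
public class SnapshotGuardSketch {

  /**
   * Returns true when a snapshot would be taken under the assumed new logic.
   *
   * @param enableSnapshot           mirrors _enableSnapshot
   * @param isPartialUpsert          mirrors (_partialUpsertHandler != null)
   * @param gotFirstConsumingSegment mirrors _gotFirstConsumingSegment
   */
  public static boolean shouldTakeSnapshot(boolean enableSnapshot, boolean isPartialUpsert,
      boolean gotFirstConsumingSegment) {
    if (!enableSnapshot) {
      return false;
    }
    // Assumed behavior: only full-upsert tables wait for the first consuming segment.
    if (!isPartialUpsert && !gotFirstConsumingSegment) {
      return false;
    }
    return true;
  }
}
```

Under this reading, the only case whose outcome changes is a partial upsert table that has not yet seen its first consuming segment, which is the case the question is about.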



##########
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeSegmentDataManager.java:
##########
@@ -703,9 +703,20 @@ public void run() {
         //   persisted.
         // Take upsert snapshot before starting consuming events
         if (_partitionUpsertMetadataManager != null) {
-          _partitionUpsertMetadataManager.takeSnapshot();
-          // If upsertTTL is enabled, we will remove expired primary keys from upsertMetadata after taking snapshot.
-          _partitionUpsertMetadataManager.removeExpiredPrimaryKeys();
+          if (_tableConfig.getUpsertMetadataTTL() > 0) {
+            // If upsertMetadataTTL is enabled, we will remove expired primary keys from upsertMetadata
+            // AFTER taking a snapshot. Taking the snapshot first is crucial to ensure we capture the final
+            // state of a particular key before it exits the TTL window.

Review Comment:
   Curious why it is critical to take the snapshot first. The state of those
   particular keys will be gone in the next snapshot anyway; would that cause
   an issue?
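To make the ordering question concrete, here is a minimal toy model of "snapshot, then remove expired keys". Everything in it is an assumption for illustration: the class `TtlSnapshotSketch`, its in-memory maps, and the watermark-minus-TTL expiry rule are hypothetical and do not reflect how Pinot's upsert metadata manager is implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model of snapshot + TTL cleanup ordering; not Pinot code.
public class TtlSnapshotSketch {
  // primary key -> comparison value (e.g. event time) of the latest record
  private final Map<String, Long> liveKeys = new HashMap<>();
  // copy of the live keys persisted by the most recent snapshot
  private Map<String, Long> snapshot = new HashMap<>();

  public void upsert(String key, long comparisonValue) {
    // Keep the largest comparison value seen for the key.
    liveKeys.merge(key, comparisonValue, Math::max);
  }

  public void takeSnapshot() {
    snapshot = new HashMap<>(liveKeys);
  }

  // Assumed expiry rule: drop keys older than (watermark - ttl).
  public void removeExpiredPrimaryKeys(long watermark, long ttl) {
    liveKeys.entrySet().removeIf(e -> e.getValue() < watermark - ttl);
  }

  public Map<String, Long> getSnapshot() {
    return snapshot;
  }

  public int liveKeyCount() {
    return liveKeys.size();
  }
}
```

In this model, snapshotting before cleanup means the persisted copy still contains a key that is about to leave the TTL window, while the reverse order drops it from that snapshot. Whether the difference matters in practice is exactly what the question above asks.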



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

