mcvsubbu commented on a change in pull request #3979: Track Indexed timestamp across consuming segments
URL: https://github.com/apache/incubator-pinot/pull/3979#discussion_r266569347
 
 

 ##########
 File path: pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
 ##########
 @@ -204,20 +205,22 @@ public boolean index(GenericRow row) {
     // If metrics aggregation is enabled and if the dimension values were already seen, this will return existing docId,
     // else this will return a new docId.
     int docId = getOrCreateDocId(dictIdMap);
-
+    boolean canTakeMore = false;
     // docId == numDocs implies new docId.
     if (docId == numDocs) {
       // Add forward and inverted indices for new document.
       addForwardIndex(row, docId, dictIdMap);
       addInvertedIndex(docId, dictIdMap);
       // Update number of document indexed at last to make the latest record queryable
-      return _numDocsIndexed++ < _capacity;
+      canTakeMore = _numDocsIndexed++ < _capacity;
     } else {
-      Preconditions
-          .checkState(_aggregateMetrics, "Invalid document-id during indexing: " + docId + " expected: " + numDocs);
+      Preconditions.checkState(_aggregateMetrics, "Invalid document-id during indexing: " + docId + " expected: " + numDocs);
       // Update metrics for existing document.
-      return aggregateMetrics(row, docId);
+      canTakeMore = aggregateMetrics(row, docId);
     }
+    // update indexing time
+    _lastIndexedTimestamp = System.currentTimeMillis();
 
 Review comment:
   Issues with indexing the row should be treated as bugs and fixed, since
they mean data loss.
   
   Also, I was proposing to extend the interface so that a stream that keeps
track of the timestamps of incoming messages (for its own SLA) can use this
mechanism. I was not referring to the time column of the schema.
   
   We can chat a little to see whether any of these ideas can be explored
further before discarding them.
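
   To make the interface idea concrete, here is a minimal sketch of one shape
it could take. Everything in it is hypothetical: `TimestampedMessage`,
`IndexingTimeTracker`, and their methods are illustrative names, not part of
Pinot's stream API or of this PR. It assumes a single consuming thread writes
the values while other threads only read them.

```java
import java.util.OptionalLong;

// Hypothetical interface (not a Pinot API): a stream message that may carry
// the timestamp the stream itself recorded for it, e.g. for SLA tracking.
interface TimestampedMessage {
  // Empty when the stream does not track per-message timestamps.
  OptionalLong ingestionTimeMs();
}

// Hypothetical helper that keeps both the wall-clock time of the last
// indexed row and the latest stream-provided timestamp seen so far.
final class IndexingTimeTracker {
  // Assumption: written by one consuming thread, read by query threads.
  private volatile long lastIndexedTimeMs = Long.MIN_VALUE;
  private volatile long latestIngestionTimeMs = Long.MIN_VALUE;

  // Call once per row after it has been indexed; nowMs is the wall clock.
  void onRowIndexed(TimestampedMessage message, long nowMs) {
    lastIndexedTimeMs = nowMs;
    OptionalLong streamTime = message.ingestionTimeMs();
    if (streamTime.isPresent()) {
      // Keep the maximum so out-of-order messages do not move time backwards.
      latestIngestionTimeMs = Math.max(latestIngestionTimeMs, streamTime.getAsLong());
    }
  }

  long lastIndexedTimeMs() {
    return lastIndexedTimeMs;
  }

  long latestIngestionTimeMs() {
    return latestIngestionTimeMs;
  }
}
```

   In this sketch a stream without timestamps returns an empty value, so
only the wall-clock indexing time is updated for its rows.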
