gortiz commented on code in PR #12704:
URL: https://github.com/apache/pinot/pull/12704#discussion_r1582735688


##########
pinot-common/src/main/java/org/apache/pinot/common/response/BrokerResponse.java:
##########
@@ -123,6 +127,17 @@ String toJsonString()
    */
   long getMinConsumingFreshnessTimeMs();
 
+  /**
+   * Get the max number of rows seen by a single operator in the query 
processing chain.
+   * <p>
+   * In single-stage this value is equal to {@link 
#getNumEntriesScannedPostFilter()} because single-stage operators
+   * cannot generate more rows than the number of rows received. This is not 
the case in multi-stage operators, where
+   * operators like join or union can emit more rows than the ones received on 
each of its children operators.
+   */
+  default long getMaxRowsInOperator() {

Review Comment:
   ~This is not a new method but a method that existed in 
`BrokerResponseNative` and was inherited by `BrokerResponseV2` . Given it is 
already there (even it doesn't make sense) and it was used to calculate 
`isPartialResponse` I think it is useful to have it in the base interface now 
that `BrokerResponseNativeV2` does not inherit `BrokerResponseNative` anymore.~
   
   ~The alternative would be to do not make `isPartialResponse` `default` and 
include the same code in both `BrokerResponseNative` and 
`BrokerResponseNativeV2`. If you think that is better I can change it, but it 
looks a bit repetitive. For sure currently how this stats are being declared 
doesn't make sense, but we can try to improve that in the future~
   
   I thought this was `isMaxRowsInJoinReached` not `getMaxRowsInOperator`. That 
is why I wrote the two paragraphs above. I'm keeping them just to have the 
context.
   
   The reason to define `getMaxRowsInOperator` is different. We need this 
method here because this is an important stat. This PR adds this value into the 
table that shows stats in Pinot UI and it is easier to assume this exists for 
both V1 and V2 than dealing with its absence. Also, it makes sense to have a 
stat that shows the max number of rows seen in any operator. In V1 that is 
limited by the number of elements returned by a filter, but in V2 that is not 
the case. I think having this attribute in both V1 and V2 is going to make the 
life easier for users who can be trained to just look for this value instead of 
reading `numEntriesScannedPostFilter` in V1 and `maxRowsInOperator` in V2
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to