gortiz commented on code in PR #8979:
URL: https://github.com/apache/pinot/pull/8979#discussion_r979763121
##########
pinot-core/src/main/java/org/apache/pinot/core/operator/combine/MinMaxValueBasedSelectionOrderByCombineOperator.java:
##########
@@ -211,30 +211,51 @@ protected void processSegments() {
((AcquireReleaseColumnsSegmentOperator) operator).release();
}
}
- PriorityQueue<Object[]> selectionResult = (PriorityQueue<Object[]>)
resultsBlock.getRows();
- if (selectionResult != null && selectionResult.size() == _numRowsToKeep)
{
- // Segment result has enough rows, update the boundary value
- assert selectionResult.peek() != null;
- Comparable segmentBoundaryValue = (Comparable)
selectionResult.peek()[0];
- if (boundaryValue == null) {
- boundaryValue = segmentBoundaryValue;
- } else {
- if (asc) {
- if (segmentBoundaryValue.compareTo(boundaryValue) < 0) {
- boundaryValue = segmentBoundaryValue;
- }
- } else {
- if (segmentBoundaryValue.compareTo(boundaryValue) > 0) {
- boundaryValue = segmentBoundaryValue;
- }
- }
- }
+ Collection<Object[]> rows = resultsBlock.getRows();
+ if (rows != null && rows.size() >= _numRowsToKeep) {
+ boundaryValue = updateBoundaryValue(boundaryValue,
extractBoundaryValue(rows), asc);
}
threadBoundaryValue = boundaryValue;
_blockingQueue.offer(resultsBlock);
}
}
+ private Comparable extractBoundaryValue(Collection<Object[]> rows) {
+ if (rows instanceof PriorityQueue) {
+ PriorityQueue<Object[]> selectionResult = (PriorityQueue<Object[]>) rows;
+ assert selectionResult.peek() != null;
+ // at this point, priority queues are sorted in the inverse order
+ return (Comparable) selectionResult.peek()[0];
+ }
+ List<Object[]> selectionResult;
+ if (rows instanceof List) {
+ selectionResult = (List<Object[]>) rows;
+ } else {
+ LOGGER.warn("Selection result stored in an unexpected data structure of
type {}. A copy is needed",
Review Comment:
We are in the paranoid territory here but... most changes I can imagine (for
example, adding a new operation that stores rows in a Set) wouldn't break in
this situation. But both solutions (falling here or just logging) are not
actually that good. What I'm going to do is to attack the cause of the problem
and change `SelectionResultsBlock` to only support Lists and PriorityQueues.
Therefore is some time in the future we add a new operation datatype, the type
system will help us to note that some `SelectionResultsBlock` consumers may
assume that rows cannot be of any Collection subclass
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]