Jackie-Jiang commented on a change in pull request #7243:
URL: https://github.com/apache/pinot/pull/7243#discussion_r682021900
##########
File path:
pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/EmptySegmentPruner.java
##########
@@ -123,6 +123,9 @@ public synchronized void refreshSegment(String segment) {
*/
@Override
public Set<String> prune(BrokerRequest brokerRequest, Set<String> segments) {
+ if (_emptySegments.isEmpty()) {
Review comment:
+1
##########
File path:
pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
##########
@@ -163,48 +181,44 @@ public synchronized void refreshSegment(String segment) {
if (filterExpression == null) {
return segments;
}
- Set<String> selectedSegments = new HashSet<>();
- for (String segment : segments) {
- PartitionInfo partitionInfo = _partitionInfoMap.get(segment);
- if (partitionInfo == null || partitionInfo == INVALID_PARTITION_INFO
|| isPartitionMatch(filterExpression,
- partitionInfo)) {
- selectedSegments.add(segment);
- }
- }
- return selectedSegments;
+ return pruneSegments((partitionInfo, cachedPartitionFunction) ->
isPartitionMatch(filterExpression,
Review comment:
(Critical) This algorithm does not prune based on the input segments,
and will ignore the results for other pruners. Also, we need to keep the
segments with `null` or `INVALID_PARTITION_INFO` because these segments might
not be partitioned
##########
File path:
pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
##########
@@ -163,48 +181,44 @@ public synchronized void refreshSegment(String segment) {
if (filterExpression == null) {
return segments;
}
- Set<String> selectedSegments = new HashSet<>();
- for (String segment : segments) {
- PartitionInfo partitionInfo = _partitionInfoMap.get(segment);
- if (partitionInfo == null || partitionInfo == INVALID_PARTITION_INFO
|| isPartitionMatch(filterExpression,
- partitionInfo)) {
- selectedSegments.add(segment);
- }
- }
- return selectedSegments;
+ return pruneSegments((partitionInfo, cachedPartitionFunction) ->
isPartitionMatch(filterExpression,
+ partitionInfo, cachedPartitionFunction));
} else {
// PQL
FilterQueryTree filterQueryTree =
RequestUtils.generateFilterQueryTree(brokerRequest);
if (filterQueryTree == null) {
return segments;
}
- Set<String> selectedSegments = new HashSet<>();
- for (String segment : segments) {
- PartitionInfo partitionInfo = _partitionInfoMap.get(segment);
- if (partitionInfo == null || partitionInfo == INVALID_PARTITION_INFO
|| isPartitionMatch(filterQueryTree,
- partitionInfo)) {
- selectedSegments.add(segment);
- }
+ return pruneSegments((partitionInfo, cachedPartitionFunction) ->
isPartitionMatch(filterQueryTree, partitionInfo, cachedPartitionFunction));
+ }
+ }
+
+ private Set<String>
pruneSegments(java.util.function.BiFunction<PartitionInfo,
CachedPartitionFunction, Boolean> partitionMatchLambda) {
+ Set<String> selectedSegments = new HashSet<>();
+ CachedPartitionFunction cachedPartitionFunction = new
CachedPartitionFunction();
Review comment:
Why are we creating a cache per query? If we just need to cache the
partition id, `Map<PartitionFunction, Map<String, Integer>>` should be enough
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]