davecromberge commented on code in PR #12164:
URL: https://github.com/apache/pinot/pull/12164#discussion_r1434289093
##########
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/store/StarTreeIndexReader.java:
##########
@@ -121,7 +124,9 @@ private void mapBufferEntries(int starTreeId,
_dataBuffer, ByteOrder.BIG_ENDIAN));
}
// Load metric (function-column pair) forward indexes
- for (AggregationFunctionColumnPair functionColumnPair :
starTreeMetadata.getFunctionColumnPairs()) {
+ TreeMap<AggregationFunctionColumnPair, AggregationSpec>
deduplicatedAggregationSpecs =
Review Comment:
I went back and forth about how best to do this as detailed in the issue and
PR description. A consequence of de-duplicating in the `StarTreeV2Metadata` and
the `StarTreeV2BuilderConfig` is that the metadata for a segment will only
contain the stored aggregation function types, and not any superfluous
aggregation function types. This has implications for whether a segment should
be rebuilt when compared to the existing config (see
`shouldModifyExistingStarTrees` in `StarTreeBuilderUtils`). If both the
metadata and builder config reflect the stored aggregated types then this
should behave correctly.
Finally, it might be surprising to the end user if some of their StarTree
aggregation configuration does not actually land up in metadata - I was
considering adding a warning or validation to the TableConfig. What do you
think?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]