gianm commented on issue #8433: StringDictionaryEncodedColumn dimSelector to return CARDINALITY_UNKNOWN with extractionFn
URL: https://github.com/apache/druid/pull/8433#issuecomment-579581765

Hmm, I came across this patch while looking at the javadocs for `DimensionSelector#getValueCardinality`. It doesn't match my understanding, which is that the ids for a selector _don't_ have to be ordered. (The ids for a 'real' column have to be, but that's different, and mostly only matters for filtering, which uses a different `BitmapIndexSelector` interface.)

@himanshug, @clintropolis, any idea if anything currently assumes or takes advantage of ids being sorted in selectors?

If not, I suggest we update the `DimensionSelector` javadocs to say that the ids are not guaranteed to be sorted. But if something does want to take advantage of sortedness, I suggest we add a new method like `DimensionSelector#isDictionarySorted` that tells callers whether they can assume the ids are sorted.

In both cases I am suggesting we keep allowing the current behavior, because allowing decorative selectors to retain the underlying dictionary enables evaluation of exprs and extraction fns on single string columns to be deferred until later in the query processing pipeline (after per-segment aggregation is complete and it's time to merge results across segments).
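
To make the second option concrete, here is a rough sketch of what an `isDictionarySorted`-style method could look like. It is only an illustration under stated assumptions: `SimpleDimensionSelector`, `DictionarySelector`, `ExtractionSelector`, and `idsAreSorted` are hypothetical, simplified stand-ins and not Druid's actual classes, and the method itself is the proposal from this comment, not an existing API. The decorating selector keeps the underlying dictionary ids and applies the extraction fn only on lookup, so its ids are no longer guaranteed to be in sorted value order.

```java
import java.util.function.UnaryOperator;

public class SortedDictionarySketch {
    // Hypothetical, simplified stand-in for a dimension selector; NOT Druid's real interface.
    interface SimpleDimensionSelector {
        int getValueCardinality();
        String lookupName(int id);
        // The proposed method: may callers assume id order matches value order?
        boolean isDictionarySorted();
    }

    // Selector backed directly by a sorted dictionary, like a 'real' column.
    static final class DictionarySelector implements SimpleDimensionSelector {
        private final String[] sortedDictionary;

        DictionarySelector(String[] sortedDictionary) {
            this.sortedDictionary = sortedDictionary;
        }

        public int getValueCardinality() { return sortedDictionary.length; }
        public String lookupName(int id) { return sortedDictionary[id]; }
        public boolean isDictionarySorted() { return true; }
    }

    // Decorating selector: retains the underlying ids, applies the fn at lookup time.
    static final class ExtractionSelector implements SimpleDimensionSelector {
        private final SimpleDimensionSelector delegate;
        private final UnaryOperator<String> extractionFn;

        ExtractionSelector(SimpleDimensionSelector delegate, UnaryOperator<String> extractionFn) {
            this.delegate = delegate;
            this.extractionFn = extractionFn;
        }

        public int getValueCardinality() { return delegate.getValueCardinality(); }
        public String lookupName(int id) { return extractionFn.apply(delegate.lookupName(id)); }
        // An arbitrary fn can reorder values, so sortedness cannot be promised.
        public boolean isDictionarySorted() { return false; }
    }

    // Helper that checks whether looked-up values are non-decreasing in id order.
    static boolean idsAreSorted(SimpleDimensionSelector selector) {
        for (int id = 1; id < selector.getValueCardinality(); id++) {
            if (selector.lookupName(id - 1).compareTo(selector.lookupName(id)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleDimensionSelector base =
            new DictionarySelector(new String[] {"apple", "banana", "cherry"});
        // Example extraction fn (an assumption for illustration): reverse each value.
        SimpleDimensionSelector decorated =
            new ExtractionSelector(base, s -> new StringBuilder(s).reverse().toString());

        System.out.println(idsAreSorted(base));      // prints "true"
        System.out.println(idsAreSorted(decorated)); // prints "false"
    }
}
```

A caller that wants to exploit sortedness would check `isDictionarySorted()` first and fall back to an order-agnostic path otherwise, while the decorated selector still exposes the same id space as the column underneath it.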
