gianm commented on issue #8433: StringDictionaryEncodedColumn dimSelector to 
return CARDINALITY_UNKNOWN with extractionFn
URL: https://github.com/apache/druid/pull/8433#issuecomment-579581765
 
 
   Hmm, I came across this patch while looking at the javadocs for 
`DimensionSelector#getValueCardinality`. It doesn't match my understanding, 
which is that the ids for a selector _don't_ have to be ordered (the ids for a 
'real' column have to be, but that's different, and mostly only matters for 
filtering, which uses a different `BitmapIndexSelector` interface).
   
   @himanshug, @clintropolis, any idea if anything currently assumes or takes 
advantage of ids being sorted in selectors?
   
   If not, I suggest we update `DimensionSelector` javadocs to say that the ids 
are not guaranteed to be sorted.
   
   But if something does want to take advantage of sortedness, I suggest we add 
a new method like `DimensionSelector#isDictionarySorted` that tells callers if 
they can assume the ids are sorted or not.
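
   To make the second option concrete, here is a rough sketch of how such a flag could behave. `SimpleSelector` is a made-up, stripped-down stand-in for illustration only (it is not Druid's `DimensionSelector`), and `isDictionarySorted` is the hypothetical method proposed above, not an existing API. The decorating selector keeps the underlying ids and cardinality but reports that name order no longer follows id order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class SelectorSortednessSketch {
    // Hypothetical, simplified stand-in for a dimension selector (assumption, not Druid's API).
    interface SimpleSelector {
        int getValueCardinality();
        String lookupName(int id);
        // Hypothetical method: true only if lookupName(id) is non-decreasing in id.
        boolean isDictionarySorted();
    }

    // Backed directly by a sorted dictionary, like a 'real' string column.
    static final class DictionarySelector implements SimpleSelector {
        private final List<String> sortedDictionary;

        DictionarySelector(List<String> sortedDictionary) {
            this.sortedDictionary = sortedDictionary;
        }

        public int getValueCardinality() { return sortedDictionary.size(); }
        public String lookupName(int id) { return sortedDictionary.get(id); }
        public boolean isDictionarySorted() { return true; }
    }

    // Decorates another selector: same ids, same cardinality, transformed names.
    static final class DecoratedSelector implements SimpleSelector {
        private final SimpleSelector delegate;
        private final Function<String, String> fn;

        DecoratedSelector(SimpleSelector delegate, Function<String, String> fn) {
            this.delegate = delegate;
            this.fn = fn;
        }

        public int getValueCardinality() { return delegate.getValueCardinality(); }
        public String lookupName(int id) { return fn.apply(delegate.lookupName(id)); }
        // The function may reorder names relative to ids, so make no sortedness promise.
        public boolean isDictionarySorted() { return false; }
    }

    public static void main(String[] args) {
        SimpleSelector base = new DictionarySelector(Arrays.asList("apple", "banana", "cherry"));
        // Example extraction function: keep only the last character.
        SimpleSelector decorated = new DecoratedSelector(base, s -> s.substring(s.length() - 1));

        for (int id = 0; id < decorated.getValueCardinality(); id++) {
            System.out.println(id + " -> " + base.lookupName(id) + " -> " + decorated.lookupName(id));
        }
        System.out.println("base sorted: " + base.isDictionarySorted());
        System.out.println("decorated sorted: " + decorated.isDictionarySorted());
    }
}
```

   With this toy data, ids 0, 1, 2 map to "e", "a", "y" after decoration, so a caller that binary-searched by name would get wrong answers unless it checked the flag first.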
   
   In both cases I am suggesting we keep allowing the current behavior, because 
allowing decorative selectors to retain the underlying dictionary enables 
evaluation of exprs and extraction fns on single string columns to be deferred 
until later in the query processing pipeline (after per-segment aggregation is 
complete and it's time to merge results across segments).
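
   As a toy illustration of that deferral (not Druid's actual query engine; every class and method name here is made up for the example), the per-segment step below aggregates purely by dictionary id and only applies the extraction function once per distinct id when it emits results for merging, instead of once per row.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

public class DeferredExtractionSketch {
    // Per-segment step: count rows per dictionary id, then apply fn once per id that occurred.
    static Map<String, Long> aggregateSegment(
            List<String> dictionary, int[] rowIds, Function<String, String> fn) {
        long[] countsById = new long[dictionary.size()];
        for (int id : rowIds) {
            countsById[id]++; // no string work per row
        }
        Map<String, Long> result = new TreeMap<>();
        for (int id = 0; id < countsById.length; id++) {
            if (countsById[id] > 0) {
                // Deferred evaluation: one fn call per id, not per row.
                result.merge(fn.apply(dictionary.get(id)), countsById[id], Long::sum);
            }
        }
        return result;
    }

    // Cross-segment step: ids are segment-local, so merging happens on the extracted values.
    static Map<String, Long> mergeSegments(List<Map<String, Long>> perSegment) {
        Map<String, Long> merged = new TreeMap<>();
        for (Map<String, Long> segment : perSegment) {
            segment.forEach((key, count) -> merged.merge(key, count, Long::sum));
        }
        return merged;
    }

    public static void main(String[] args) {
        Function<String, String> firstLetter = s -> s.substring(0, 1);

        Map<String, Long> segmentA = aggregateSegment(
                Arrays.asList("apple", "avocado", "banana"), new int[] {0, 0, 1, 2}, firstLetter);
        Map<String, Long> segmentB = aggregateSegment(
                Arrays.asList("blueberry", "cherry"), new int[] {0, 1, 1}, firstLetter);

        System.out.println(mergeSegments(Arrays.asList(segmentA, segmentB)));
        // prints {a=3, b=2, c=2}
    }
}
```

   This only works if the decorating selector is allowed to hand out the underlying ids; if it had to report unknown cardinality, the id-indexed array in the per-segment step would not be possible.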

