AlexanderSaydakov commented on issue #7486: regression: quantilesDoublesSketch 
returns null instead of empty array 
URL: 
https://github.com/apache/incubator-druid/issues/7486#issuecomment-485914210
 
 
   Are you referring to the finalizeComputation() method as the "default" state?
   This notion makes more or less sense depending on the sketch. For 
distinct counting sketches like Theta or HLL it makes perfect sense: it 
must be the estimate of the distinct count. It makes much less sense for a 
quantiles sketch, which represents an arbitrary distribution of some values. We 
decided that the most sensible single number for this purpose would be the 
total weight of the distribution (also known as the stream length, or the number 
of values if they all have a weight of 1).
   Now consider an empty quantiles sketch. Nothing is known about the 
distribution. The total weight is zero, so there is no problem with this 
questionable finalizeComputation(), but we cannot say anything at all about 
quantiles or ranks. What is the min value, median or max or any other quantile? 
If a single quantile is requested, then the best answer must be NaN, not zero, 
since zero is a perfectly good number and would be deeply misleading. What to 
do if an array of quantiles is requested? The options are: null, an empty array, or 
an array of NaNs (as many as were requested). We used to do the last of these in 
the core library, but at some point (between versions 0.10.3 and 0.11.0) we 
decided that it did not make much sense to spend resources in the core library 
to create this array and fill it with NaN values. We decided to return null and 
let the user decide what to do in the presentation layer. So let's decide what 
we want Druid to return from such a post-agg.
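
To make the options concrete, here is a minimal sketch of what the presentation layer could do with a null answer. It is plain Java and does not call the sketch library or Druid at all. The class and method names (EmptySketchPresentation, orNaNs, orEmpty) are hypothetical and exist only for illustration, and the null literal stands in for whatever the core library returns for an empty sketch.

```java
import java.util.Arrays;

// Hypothetical illustration only: not part of the sketch library or Druid.
public class EmptySketchPresentation {

    // Option "array of NaNs": one NaN per requested quantile when the
    // library answered null; otherwise the answer is passed through as-is.
    public static double[] orNaNs(double[] quantiles, int requested) {
        if (quantiles != null) {
            return quantiles;
        }
        double[] result = new double[requested];
        Arrays.fill(result, Double.NaN);
        return result;
    }

    // Option "empty array": replace null with a zero-length array.
    public static double[] orEmpty(double[] quantiles) {
        return quantiles != null ? quantiles : new double[0];
    }

    public static void main(String[] args) {
        // Assumption: null stands in for the answer from an empty sketch.
        double[] fromEmptySketch = null;

        System.out.println(Arrays.toString(orNaNs(fromEmptySketch, 3))); // [NaN, NaN, NaN]
        System.out.println(Arrays.toString(orEmpty(fromEmptySketch)));   // []
    }
}
```

Either conversion is cheap at this level, so the choice is mostly about which result is least surprising to the person reading the query output.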

