top n query results

Samarth Jain Sun, 05 Aug 2018 21:41:21 -0700

I have an internal test harness that I am using to test a version
upgrade from Druid 0.10.1 to 0.12.2. As part of the testing, I noticed that
executing the same query against the same data source gives slightly
different results on 0.10.1 and 0.12.2. I have seen this happen for the
search, group by, top n, and select query types. What all such queries
have in common is a paging spec, with descending set to false.


"pagingSpec": {"pagingIdentifiers": {}, "threshold": 5000}
"descending": false

My guess is that the data is distributed slightly differently within the
two clusters, which is causing this mismatch. Is my guess correct? If so, is
there a way to make this comparison deterministic?
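
A minimal sketch of one way a harness could make such a comparison insensitive to row order, assuming both clusters return the same set of rows and only their order differs. The function name, the key_fields parameter, and the row layout (a list of dicts) are hypothetical and not part of Druid; numeric fields are compared with a tolerance rather than exactly.

```python
import math


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def rows_match(rows_a, rows_b, key_fields, rel_tol=1e-6):
    """Compare two result sets while ignoring row order.

    key_fields (hypothetical) names the columns that uniquely identify a
    row, for example the dimensions of a group by result.
    """
    def sort_key(row):
        return tuple(str(row.get(field)) for field in key_fields)

    a = sorted(rows_a, key=sort_key)
    b = sorted(rows_b, key=sort_key)
    if len(a) != len(b):
        return False
    for row_a, row_b in zip(a, b):
        if row_a.keys() != row_b.keys():
            return False
        for field in row_a:
            va, vb = row_a[field], row_b[field]
            if _is_number(va) and _is_number(vb):
                # Tolerate small precision differences in numeric values.
                if not math.isclose(va, vb, rel_tol=rel_tol):
                    return False
            elif va != vb:
                return False
    return True


# Illustrative rows only; the dimension name and values are made up,
# except for the two doubleSum values quoted in this message.
old_rows = [{"dim": "a", "sum": 616346.0}, {"dim": "b", "sum": 10.0}]
new_rows = [{"dim": "b", "sum": 10.0}, {"dim": "a", "sum": 616346.0208094628}]
print(rows_match(old_rows, new_rows, key_fields=["dim"]))
```

This does not help if a threshold truncates the results so that the two clusters return different rows; in that case the query itself would need a total ordering.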

The other thing that I observed is that with the doubleSum aggregation type,
0.10.1 is returning values with lower precision (e.g. 616346.0) as opposed
to 0.12.1 (e.g. 616346.0208094628). Did something change between these
versions to cause this difference in precision?
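
For reference, the two example values differ by less than the spacing of 32-bit floats at that magnitude, so rounding the higher-precision value to single precision reproduces the lower-precision one. The sketch below only demonstrates that arithmetic; whether float versus double storage is what actually differs between the two Druid versions is an assumption that is not confirmed here. The helper name to_float32 is hypothetical.

```python
import struct


def to_float32(value):
    # Round-trip through a 4-byte IEEE 754 float to drop the extra precision.
    return struct.unpack("f", struct.pack("f", value))[0]


double_value = 616346.0208094628
print(to_float32(double_value))  # 616346.0
```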

