I have an internal test harness that I am using to test a version
upgrade from Druid 0.10.1 to 0.12.2. As part of the testing, I noticed that
executing the same query against the same data sources (on different Druid
clusters) gives slightly different results for 0.10.1 and 0.12.2. I have
seen this happen for the search, groupBy, topN, and select query types.
What all such queries have in common is a paging spec, with descending set
to false.

"pagingSpec": {"pagingIdentifiers": {}, "threshold": 5000}
"descending": false

My guess is that the data is distributed slightly differently within the
two clusters, which, combined with the paging spec, is causing this
mismatch. Is my guess correct? If so, is there a way to make this kind of
testing deterministic?
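
One way to make the comparison step itself less sensitive to row order and
to floating-point representation is sketched below in Python. This is only
a sketch under assumptions, not something Druid provides: the function
names are hypothetical, each result is assumed to be already flattened into
a list of dicts, and the non-numeric fields (timestamp and dimensions) are
assumed to identify a row uniquely. It does not help if the two clusters
return a different set of rows within the paging threshold.

```python
import json
import math


def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def canonical_key(row):
    # Sort key built only from non-numeric fields (assumed to be the
    # timestamp and dimensions), serialized with sorted keys so it does
    # not depend on field order or on the order rows were returned in.
    stable = {k: v for k, v in row.items() if not is_number(v)}
    return json.dumps(stable, sort_keys=True)


def rows_match(a, b, rel_tol=1e-6):
    # Rows match when they have the same fields, numeric values agree
    # within the relative tolerance, and everything else is equal.
    if a.keys() != b.keys():
        return False
    for field in a:
        x, y = a[field], b[field]
        if is_number(x) and is_number(y):
            if not math.isclose(x, y, rel_tol=rel_tol):
                return False
        elif x != y:
            return False
    return True


def results_match(old_rows, new_rows, rel_tol=1e-6):
    # Compare two result sets regardless of the order rows came back in.
    if len(old_rows) != len(new_rows):
        return False
    old_sorted = sorted(old_rows, key=canonical_key)
    new_sorted = sorted(new_rows, key=canonical_key)
    return all(rows_match(a, b, rel_tol)
               for a, b in zip(old_sorted, new_sorted))
```

The relative tolerance is a judgment call: a loose value hides precision
differences between versions, while a tight one reports them as mismatches.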

The other thing I observed is that, with the doubleSum aggregation type,
0.10.1 returns values with lower precision (e.g. 616346.0) than
0.12.2 does (e.g. 616346.0208094628). Did something change between these
versions that would explain the difference in precision?
