Re: Different query results for 0.12.2 and 0.10.1

2018-08-06 Thread Gian Merlino
Hi Samarth,

The doubleSum difference is most likely because, before 0.11.0, Druid read
values out of columns as 32-bit floats and then cast them to 64-bit doubles.
Now it can read them directly as 64-bit doubles. And actually, it can
_store_ floating-point values as 64-bit doubles too, although this won't be
enabled by default until 0.13.0 (see
http://druid.io/docs/latest/configuration/index.html#double-column-storage
for how to enable it today).
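
To illustrate the effect on a single number (a sketch only, not Druid's
actual read path): round-tripping the higher-precision value from your
message through a 32-bit float in Python drops the fractional part, because
adjacent 32-bit floats near 616346 are 0.0625 apart. A real doubleSum adds
up many rows, so the two versions won't always differ in exactly this way.
The helper name through_float32 is made up for this example.

```python
import struct


def through_float32(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


as_double = 616346.0208094628  # higher-precision value quoted in this thread
as_float = through_float32(as_double)

print(as_float)              # 616346.0
print(as_double - as_float)  # what the 32-bit read throws away
```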

Some thoughts on specific query types:

- The ordering of select results can vary due to differing choices about
which segments to read first. The results will stay in time order, but two
results with the same timestamp might swap positions. Btw, if you don't
need the strict time ordering guarantees, consider Scan queries (
http://druid.io/docs/latest/querying/scan-query.html) which are much
lighter in terms of memory usage.
- The exact ranking and values of TopN results can also vary, since topNs
are approximate: their results depend on which segments are processed in
which order and on which servers.
- GroupBy I would not expect to vary: what kinds of differences are you
seeing there?
- Search I'm not familiar with enough to think of a reason why it should or
shouldn't vary.
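
For the select case, a test harness could compare results while ignoring
the order of rows that share a timestamp. The sketch below assumes the
events have already been flattened into a list of dicts in time order, each
with a "timestamp" key; that shape and both function names are assumptions
for illustration, not something Druid provides.

```python
import json
from itertools import groupby


def normalize_ties(events, time_key="timestamp"):
    """Keep time order, but put rows that share a timestamp into a canonical order."""
    normalized = []
    # groupby only merges consecutive rows, which is fine for time-ordered input.
    for _, group in groupby(events, key=lambda e: e[time_key]):
        normalized.extend(sorted(group, key=lambda e: json.dumps(e, sort_keys=True)))
    return normalized


def same_modulo_ties(a, b, time_key="timestamp"):
    """True if a and b contain the same rows, differing only in tie order."""
    return normalize_ties(a, time_key) == normalize_ties(b, time_key)


old = [
    {"timestamp": "2018-08-01T00:00:00Z", "page": "x"},
    {"timestamp": "2018-08-01T00:00:00Z", "page": "y"},
    {"timestamp": "2018-08-01T00:01:00Z", "page": "z"},
]
new = [old[1], old[0], old[2]]  # the first two rows swapped within one timestamp

print(same_modulo_ties(old, new))  # True
```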

One thing you can do to get more consistent results for comparison is to
add "bySegment" : true to your query context. This skips the merging step
and just returns sub-results for each segment individually. Most of the
potential variation is introduced in the merging step, so this should give
you more consistent results, with the caveat that you won't be testing the
merging step itself.
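
As a rough sketch of how a harness might set that flag before sending a
query: the helper below copies a native query and adds "bySegment" to its
context. The helper name, data source, interval, and column names are
placeholders, not taken from your setup.

```python
import copy
import json


def with_by_segment(query: dict) -> dict:
    """Return a copy of a native query with "bySegment": true in its context."""
    patched = copy.deepcopy(query)
    patched.setdefault("context", {})["bySegment"] = True
    return patched


# Hypothetical query: data source, interval, and column names are placeholders.
query = {
    "queryType": "timeseries",
    "dataSource": "my_datasource",
    "granularity": "all",
    "intervals": ["2018-08-01/2018-08-02"],
    "aggregations": [{"type": "doubleSum", "name": "total", "fieldName": "value"}],
}

print(json.dumps(with_by_segment(query), indent=2))
```

The original query dict is left untouched, so the same definition can be
run both with and without the flag.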

On Sun, Aug 5, 2018 at 10:55 PM Samarth Jain wrote:

> I have an internal test harness set up that I am using to test a version
> upgrade from Druid 0.10.1 to 0.12.2. As part of the testing, I noticed that
> executing the same query against the same data sources (on different Druid
> clusters) gives slightly different results for 0.10.1 and 0.12.2. I have
> seen this happen for search, groupBy, topN, and select query types. The
> common part in all such queries is that they have a paging spec with
> descending set to false.
>
> "pagingSpec": {"pagingIdentifiers": {}, "threshold": 5000},
> "descending": false
>
> My guess is that the data is distributed slightly differently within the
> two clusters, which combined with the paging spec is causing this mismatch.
> Is my guess correct? If so, is there a way to make this kind of testing
> deterministic?
>
> The other thing I observed is that with the doubleSum aggregation type,
> 0.10.1 returns values with lower precision (e.g. 616346.0) than 0.12.1
> (e.g. 616346.0208094628). Did something change to cause this difference in
> precision?
>


Different query results for 0.12.2 and 0.10.1

2018-08-05 Thread Samarth Jain
I have an internal test harness set up that I am using to test a version
upgrade from Druid 0.10.1 to 0.12.2. As part of the testing, I noticed that
executing the same query against the same data sources (on different Druid
clusters) gives slightly different results for 0.10.1 and 0.12.2. I have
seen this happen for search, groupBy, topN, and select query types. The
common part in all such queries is that they have a paging spec with
descending set to false.

"pagingSpec": {"pagingIdentifiers": {}, "threshold": 5000},
"descending": false

My guess is that the data is distributed slightly differently within the
two clusters, which combined with the paging spec is causing this mismatch.
Is my guess correct? If so, is there a way to make this kind of testing
deterministic?

The other thing I observed is that with the doubleSum aggregation type,
0.10.1 returns values with lower precision (e.g. 616346.0) than 0.12.1
(e.g. 616346.0208094628). Did something change to cause this difference in
precision?