Thanks for the explanation, Gian.

I looked closer, and the differing results for the group by query turned out
to be caused by Druid segments being temporarily unavailable. Under load,
some of the historicals ran into long GC pauses, causing them to lose their
ZK connection and fall out of the cluster. When things were stable, both
0.10.1 and 0.12.2 yielded the same responses for the group by queries.
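
For anyone building a similar comparison harness, here is a rough sketch of
the kind of check that helps, written in Python. It only assumes what is
mentioned in this thread: that "bySegment": true can be added to the query
context. The helper names (with_by_segment, canonical, same_results) are made
up for illustration, and the order-insensitive comparison is my own choice for
a harness, not something Druid provides. The shape of the result rows is also
an assumption, so the comparison treats them as opaque JSON values.

```python
import copy
import json


def with_by_segment(query):
    """Return a copy of a Druid query dict with "bySegment": true in its context.

    The original query dict is left untouched.
    """
    q = copy.deepcopy(query)
    q.setdefault("context", {})["bySegment"] = True
    return q


def canonical(rows):
    """Serialize each row with sorted keys, then sort the rows themselves.

    This makes the comparison independent of row order and of key order.
    """
    return sorted(json.dumps(row, sort_keys=True) for row in rows)


def same_results(rows_a, rows_b):
    """True if both result lists hold the same rows, ignoring ordering."""
    return canonical(rows_a) == canonical(rows_b)
```

Comparing only while both clusters report all segments as loaded avoids the
false mismatches described above.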



On Mon, Aug 6, 2018 at 7:49 AM, Gian Merlino <g...@apache.org> wrote:

> Hi Samarth,
>
> The doubleSum difference is likely due to the fact that before 0.11.0,
> Druid read values out of columns as 32 bit floats and then cast them to 64
> bit doubles. Now it can read them directly as 64 bit doubles. And actually,
> it can _store_ floating point values as 64 bit doubles too, although this
> won't be enabled by default until 0.13.0 (see
> http://druid.io/docs/latest/configuration/index.html#double-column-storage
> for how to enable it today).
>
> Some thoughts on specific query types:
>
> - The ordering of select results can vary due to differing choices about
> which segments to read first. The results will stay in time order, but two
> results with the same timestamp might swap positions. Btw, if you don't
> need the strict time ordering guarantees, consider Scan queries (
> http://druid.io/docs/latest/querying/scan-query.html) which are much
> lighter in terms of memory usage.
> - The exact ranking and values of TopN results can also vary, since topNs
> are approximate and their results can vary based on which segments are
> processed in which order and on which servers.
> - GroupBy I would not expect to vary: what kinds of differences are you
> seeing there?
> - Search I'm not familiar with enough to think of a reason why it should or
> shouldn't vary.
>
> One thing you can do to try to get more consistent results for comparison
> is add "bySegment" : true to your context. This will skip the merging step,
> and just return sub-results for each segment individually. Most of the
> potential variation is introduced in the merging step, so this should give
> you more consistent results, with the caveat that you won't be testing the
> merging step itself.
>
> On Sun, Aug 5, 2018 at 10:55 PM Samarth Jain <samarth.j...@gmail.com>
> wrote:
>
> > I have an internal test harness set up that I am using for testing the
> > version upgrade from Druid 0.10.1 to 0.12.2. As part of the testing, I
> > noticed that executing the same query against the same data sources (on
> > different Druid clusters) gives slightly different results for 0.10.1 and
> > 0.12.2. I have seen this happen for the search, group by, top n, and select
> > query types. The common part in all such queries is that they have a
> > paging spec with descending set to false.
> >
> > "pagingSpec": {"pagingIdentifiers": {}, "threshold": 5000}
> > "descending": false
> >
> > My guess is that data is distributed slightly differently within the two
> > clusters, which combined with the paging spec is causing this mismatch. Is
> > my guess correct? If so, is there a way to make this kind of testing
> > deterministic?
> >
> > The other thing that I observed is that with the doubleSum aggregation
> > type, 0.10.1 is returning values with lower precision (e.g. 616346.0) as
> > opposed to 0.12.1 (e.g. 616346.0208094628). Did something change to cause
> > this change in precision?
> >
>
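
The float-versus-double effect described in the quoted reply can be
illustrated outside Druid. The sketch below is plain Python, not Druid code:
it round-trips a single value through a 32-bit float using the standard
struct module, using the value from the original question. It is only an
illustration of why a 32-bit read loses the fractional digits at this
magnitude; it does not reproduce Druid's actual aggregation path, and the
function name through_float32 is made up for the example.

```python
import struct


def through_float32(x):
    """Round a Python float (64-bit double) to the nearest 32-bit float,
    then widen it back to a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 616346.0208094628

# Near 616346 the spacing between adjacent 32-bit floats is 2**-4 = 0.0625,
# so a fraction of about 0.0208 rounds away.
print(through_float32(value))  # 616346.0
print(value)                   # 616346.0208094628
```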
