Re: Different query results for 0.12.2 and 0.10.1

2018-08-06 Thread Samarth Jain
Thanks for the explanation, Gian.

I looked closer, and the different results for the group by query turned out
to be a case of Druid segments being temporarily unavailable. Under load,
some of the historicals ran into long GC pauses, causing them to lose their
ZooKeeper connection and fall out of the cluster. When things were stable,
both 0.10.1 and 0.12.2 returned the same responses for the group by queries.
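
One way to rule out this kind of transient unavailability before comparing
two clusters is to check segment load status first. The following is a
minimal sketch, not part of the original test harness. It assumes the
response of the Coordinator's /druid/coordinator/v1/loadstatus endpoint has
already been fetched and parsed into a dict mapping datasource name to the
percentage of segments loaded. The datasource names are hypothetical.

```python
def datasources_not_fully_loaded(load_status):
    """Return the datasources whose loaded percentage is below 100.

    load_status is assumed to look like {"datasource_name": 100.0, ...}.
    """
    return sorted(name for name, pct in load_status.items() if pct < 100.0)


# Hypothetical snapshots taken from each cluster just before a comparison run.
cluster_old = {"events": 100.0, "metrics": 100.0}
cluster_new = {"events": 97.5, "metrics": 100.0}

for label, status in (("0.10.1", cluster_old), ("0.12.2", cluster_new)):
    missing = datasources_not_fully_loaded(status)
    if missing:
        print(f"{label}: skip comparison, not fully loaded: {missing}")
    else:
        print(f"{label}: all segments loaded")
```

A check like this only reflects the state at the moment it runs, so a
historical that drops out in the middle of a query could still go unnoticed.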



On Mon, Aug 6, 2018 at 7:49 AM, Gian Merlino  wrote:

> Hi Samarth,
>
> The doubleSum difference is likely due to the fact that before 0.11.0,
> Druid read values out of columns as 32 bit floats and then cast them to 64
> bit doubles. Now it can read them directly as 64 bit doubles. And actually,
> it can _store_ floating point values as 64 bit doubles too, although this
> won't be enabled by default until 0.13.0 (see
> http://druid.io/docs/latest/configuration/index.html#double-column-storage
> for how to enable it today).
>
> Some thoughts on specific query types:
>
> - The ordering of select results can vary due to differing choices about
> which segments to read first. The results will stay in time order, but two
> results with the same timestamp might swap positions. Btw, if you don't
> need the strict time ordering guarantees, consider Scan queries (
> http://druid.io/docs/latest/querying/scan-query.html) which are much
> lighter in terms of memory usage.
> - The exact ranking and values of TopN results can also vary, since topNs
> are approximate and their results can vary based on which segments are
> processed in which order and on which servers.
> - GroupBy I would not expect to vary: what kinds of differences are you
> seeing there?
> - Search I'm not familiar with enough to think of a reason why it should or
> shouldn't vary.
>
> One thing you can do to try to get more consistent results for comparison
> is add "bySegment" : true to your context. This will skip the merging step,
> and just return sub-results for each segment individually. Most of the
> potential variation is introduced in the merging step, so this should give
> you more consistent results. With the caveat that it means you won't be
> getting to test the merging step.
>
> On Sun, Aug 5, 2018 at 10:55 PM Samarth Jain 
> wrote:
>
> > I have an internal test harness setup that I am using for testing version
> > upgrade from Druid 0.10.1 to 0.12.2. As part of the testing, I noticed
> that
> > executing the same query against the same data sources(on different druid
> > clusters) gives slightly different results for 0.10.1 and 0.12.2. I have
> > seen this happen for search, group by, top n, select query types. The
> > common part in all such queries is that they have a paging spec with
> > descending set to false.
> >
> > "pagingSpec": {"pagingIdentifiers": {}, "threshold": 5000}
> > "desceding": false
> >
> > My guess is that data distribution is slightly differently within the two
> > clusters which combined with paging spec is causing this mismatch. Is my
> > guess correct? If so, is there a way to make such kind of testing
> > deterministic.
> >
> > The other thing that I observed is that with doubleSum aggregation type,
> > 0.10.1 is returning values with lower precision (ex - 616346.0) as
> opposed
> > to 0.12.1 (ex - 616346.0208094628). Did something change to cause this
> > change in precision?
> >
>


Re: Changing release process for release candidates

2018-08-06 Thread Julian Hyde
If it’s not clear, “release manager” is an important role. The RM is 
responsible for the key events in the release - building a release candidate 
(including signatures), putting it in a place where it can be viewed, writing 
release notes, starting a vote, closing the vote, writing the release 
announcement.

There MUST be a vote on the release, but for other parts of the process it’s 
best if the RM doesn’t hang around waiting for permission.



> On Aug 6, 2018, at 4:52 PM, Gian Merlino  wrote:
> 
> It sounds good to me, it streamlines things a bit and seems to be what
> other projects are doing. As Julian pointed out in the other thread it
> still pays to have someone "managing" the release and to have some
> discussion about when's the right time to start a release branch. The
> "release manager" job has rotated through a few different people over the
> past few major releases, which is good.
> 
> On Mon, Aug 6, 2018 at 3:20 PM Jihoon Son  wrote:
> 
>> Hi all,
>> 
>> Our current release process for RCs begins with a vote. It usually takes up
>> a few days, but is actually not a mandatory process for creating RCs. If we
>> can reach consensus without explicit votes, we can expect the faster
>> release in the future.
>> 
>> The original discussion is available at
>> 
>> https://lists.apache.org/thread.html/d887f0c6e23f1625e549389c08a9a5e74a7a24db4d5e007b6e8d10f6@%3Cdev.druid.apache.org%3E
>> .
>> 
>> Welcome any idea.
>> 
>> Best,
>> Jihoon
>> 


-
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org



Re: Changing release process for release candidates

2018-08-06 Thread Gian Merlino
It sounds good to me; it streamlines things a bit and seems to be what
other projects are doing. As Julian pointed out in the other thread, it
still pays to have someone "managing" the release and to have some
discussion about when's the right time to start a release branch. The
"release manager" job has rotated through a few different people over the
past few major releases, which is good.

On Mon, Aug 6, 2018 at 3:20 PM Jihoon Son  wrote:

> Hi all,
>
> Our current release process for RCs begins with a vote. It usually takes up
> a few days, but is actually not a mandatory process for creating RCs. If we
> can reach consensus without explicit votes, we can expect the faster
> release in the future.
>
> The original discussion is available at
>
> https://lists.apache.org/thread.html/d887f0c6e23f1625e549389c08a9a5e74a7a24db4d5e007b6e8d10f6@%3Cdev.druid.apache.org%3E
> .
>
> Welcome any idea.
>
> Best,
> Jihoon
>


Re: Druid 0.12.2 release vote

2018-08-06 Thread Fangjin Yang
+1

On Mon, Aug 6, 2018 at 3:03 PM, Jihoon Son  wrote:

> Hi all,
>
> Druid 0.12.2-rc1 (http://druid.io/downloads.html) is available now, and I
> think it's time to vote on the 0.12.2 release. Please note that 0.12.2 is
> not an ASF release.
>
> Here is my +1.
>
> Best,
> Jihoon
>


Re: About creating 0.12.2-rc1

2018-08-06 Thread Jihoon Son
Sure. Please check
https://lists.apache.org/thread.html/b88a508d31fb18e951c6cb556e25177436f392a49413e031e2dab3c4@%3Cdev.druid.apache.org%3E
.

Best,
Jihoon

On Mon, Aug 6, 2018 at 10:37 AM Gian Merlino  wrote:

> Thanks Jihoon! Would you mind starting the other thread too when you get a
> chance?
>
> On Mon, Aug 6, 2018 at 10:13 AM Jihoon Son  wrote:
>
> > Thanks guys.
> >
> > I'm creating 0.12.2-rc1 now.
> >
> > Regarding creating an RC without vote, I think it's worth to have a
> > discussion in another thread to make sure everyone knows about the new RC
> > release process.
> >
> > Best,
> > Jihoon
> >
> > On Sun, Aug 5, 2018 at 10:30 AM Julian Hyde 
> > wrote:
> >
> > > Gian is correct. Creating an RC doesn’t require a vote. It does
> require a
> > > release manager. Usually in Calcite we determine the timeframe of the
> > > release, and choose an RM, by a discussion that reaches consensus
> without
> > > an explicit vote.
> > >
> > > The RM may do a little “traffic control”, asking whether people
> consider
> > > the branch is in good shape, and perhaps asking people to stop pushing,
> > > again by a non-vote email thread.
> > >
> > > Julian
> > >
> > > > On Aug 5, 2018, at 8:56 AM, Gian Merlino  wrote:
> > > >
> > > > +1, and fwiw, it looks like Apache projects don't always need to do
> > votes
> > > > for creating release candidates. For example on the Calcite mailing
> > list
> > > I
> > > > see votes for _final_ releases, but the release candidates seem to be
> > > > created and uploaded without a vote. There is generally some
> discussion
> > > on
> > > > the list about whether it's a good time to do a release candidate,
> but
> > I
> > > > don't generally see formal votes. I think something similar could
> work
> > > for
> > > > us in the future and could help us get releases out quicker.
> > > >
> > > > On Fri, Aug 3, 2018 at 9:38 PM Prashant Deva <
> prashant.d...@gmail.com>
> > > > wrote:
> > > >
> > > >> +1
> > > >> Prashant
> > > >>
> > > >>
> > > >> On Fri, Aug 3, 2018 at 7:11 PM Niketh Sabbineni <
> > > >> niketh.sabbin...@gmail.com>
> > > >> wrote:
> > > >>
> > > >>> +1
> > > >>>
> > > >>> Looking forward to this
> > > >>>
> > >  On Fri, Aug 3, 2018 at 7:09 PM Jihoon Son 
> > > wrote:
> > > 
> > >  Hi folks,
> > > 
> > >  Releasing 0.12.2 has been delayed because, fortunately, we could
> > find
> > > >>> more
> > >  bugs to be fixed before release.
> > > 
> > >  Currently, there remains only one PR (
> > >  https://github.com/apache/incubator-druid/pull/6106 ) to be
> merged
> > > for
> > >  0.12.2. Once the Travis CI passes, I'll merge that PR shortly.
> Then,
> > > >>> we're
> > >  ready for 0.12.2-rc1 release.
> > > 
> > >  So, I think it's time to ask your opinion about creating
> 0.12.2-rc1
> > > >>> without
> > >  the release vote. I think it makes sense because we have already
> had
> > > >> two
> > >  votes (
> > > 
> > > 
> > > >>>
> > > >>
> > >
> >
> https://lists.apache.org/thread.html/a96f2e39506118be26184bd950bc51d360107d75e9ac547d8597817a@%3Cdev.druid.apache.org%3E
> > >  ,
> > > 
> > > 
> > > >>>
> > > >>
> > >
> >
> https://lists.apache.org/thread.html/11a50f22e7669a527625e190bebbe50b7586dd72733c3bf6a1024c02@%3Cdev.druid.apache.org%3E
> > >  )
> > >  for 0.12.2-rc1 release and there's no objection.
> > > 
> > >  If there's no objection for this for 48 hours, I'll start
> 0.12.2-rc1
> > >  release.
> > > 
> > >  Best,
> > >  Jihoon
> > > 
> > > >>> --
> > > >>> Niketh Sabbineni
> > > >>>
> > > >>
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > > For additional commands, e-mail: dev-h...@druid.apache.org
> > >
> > >
> >
>


Changing release process for release candidates

2018-08-06 Thread Jihoon Son
Hi all,

Our current release process for RCs begins with a vote. It usually takes
a few days, but is actually not a mandatory step for creating RCs. If we
can reach consensus without explicit votes, we can expect faster
releases in the future.

The original discussion is available at
https://lists.apache.org/thread.html/d887f0c6e23f1625e549389c08a9a5e74a7a24db4d5e007b6e8d10f6@%3Cdev.druid.apache.org%3E
.

Any ideas are welcome.

Best,
Jihoon


Druid 0.12.2 release vote

2018-08-06 Thread Jihoon Son
Hi all,

Druid 0.12.2-rc1 (http://druid.io/downloads.html) is available now, and I
think it's time to vote on the 0.12.2 release. Please note that 0.12.2 is
not an ASF release.

Here is my +1.

Best,
Jihoon


Re: Podling Report (August 2018)

2018-08-06 Thread Julian Hyde
I have signed off.

Taylor and Jun, please sign off as well.


> On Aug 6, 2018, at 6:43 AM, Gian Merlino  wrote:
> 
> It looks like the page is up now, so I posted this report there.
> 
> On Thu, Aug 2, 2018 at 3:03 PM Gian Merlino  wrote:
> 
>> That sounds like a good question for gene...@incubator.apache.org.
>> 
>> On Thu, Aug 2, 2018 at 2:22 PM Jonathan Wei  wrote:
>> 
>>> Hm, looks like the wiki page https://wiki.apache.org/incubator/August2018
>>> still
>>> doesn't exist, any idea when it'll be up?
>>> 
>>> 
>>> On Thu, Aug 2, 2018 at 1:04 PM, Julian Hyde  wrote:
>>> 
 Please email the mentors (or the dev list) when it is posted to the wiki
 and is ready for sign-off.
 
 Julian
 
 
> On Aug 1, 2018, at 7:19 PM, Jonathan Wei  wrote:
> 
> I don't see the incubator wiki page for August 2018 up yet, so I'll
>>> post
> the current report here for now:
> 
> 
> Druid Podling Report (August 2018)
> 
> 
> Druid is a high-performance, column-oriented, distributed data store.
> 
> Druid has been incubating since 2018-02-28.
> 
> Three most important issues to address in the move towards graduation:
> 
> 1. Plan and execute our first Apache release.
> 2. Move the website to Apache infrastructure.
> 3. Expanding the community and adding more committers
> 
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
 aware
> of?
> 
> - None.
> 
> How has the community developed since the last report?
> 
> - A healthy, constant flow of bug fixes, quality improvements and new
> features
> are still ongoing at https://github.com/apache/incubator-druid.
> - Our next community meetup has been scheduled for August 8.
> 
> How has the project developed since the last report?
> 
> - This report covers activity since the May 2018 report
> - Source has been migrated to Apache infrastructure
> - License header updates are almost complete
> - Since the last report there have been 93 commits from 20
>>> individuals.
> - We have released 0.12.1, a non-incubator release.
> - We currently voting on the 0.12.2 non-incubator bug fix release.
>>> This
> will be our final non-incubator release.
> 
> How would you assess the podling's maturity?
> Please feel free to add your own commentary.
> 
> [ ] Initial setup
> [X] Working towards first release
> [ ] Community building
> [ ] Nearing graduation
> [ ] Other:
> 
> Date of last release:
> 
> - Druid 0.12.1 on 2018-06-08 (non-Apache release)
> - No official Apache release yet since beginning Apache Incubation
> 
> When were the last committers or PPMC members elected?
> 
> - Project is still functioning with the initial set of committers.
> 
> On Tue, Jul 31, 2018 at 4:02 PM, Jonathan Wei 
>>> wrote:
> 
>> Thanks for reviewing.
>> 
>> Our last meetup was in March, but we have an upcoming meetup on
>>> August 8
>> (after the report is due I assume), should we mention that in this
 report?
>> 
>> I noticed the incubator wiki page for August 2018 hasn't been created
 yet,
>> does anyone know if that's expected to be up soon? (Not sure on what
 exact
>> date our podling report is due)
>> 
>> - Jon
>> 
>> 
>> On Sun, Jul 29, 2018 at 10:30 AM, Julian Hyde <
>>> jhyde.apa...@gmail.com>
>> wrote:
>> 
>>> Thanks - this looks good.
>>> 
>>> I’d not include the url for the case to replace license headers.
 Reports
>>> rarely include urls whose sole purpose is to prove statements in the
>>> report.
>>> 
>>> If there have been meet ups/talks about Druid, mention them. Druid
>>> has
 a
>>> vibrant community, as a result of your ongoing community building
>>> activities such as talks, but still, you should take credit for
>>> them.
>>> 
>>> Julian
>>> 
 On Jul 28, 2018, at 10:24 AM, Jonathan Wei 
>>> wrote:
 
 Hi all,
 
 I'm posting a draft of the August 2018 report (which covers
>>> activity
>>> since
 our last report in May):
 
 
 
 Druid Podling Report (August 2018)
 
 
 Druid is a high-performance, column-oriented, distributed data
>>> store.
 
 Druid has been incubating since 2018-02-28.
 
 Three most important issues to address in the move towards
>>> graduation:
 
 1. Plan and execute our first Apache release.
 2. Move the website to Apache infrastructure.
 3. Expanding the community and adding more committers
 
 Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to
>>> be
>>> aware
 of?
 
 - None.
 
 How has the community develo

Re: About creating 0.12.2-rc1

2018-08-06 Thread Gian Merlino
Thanks Jihoon! Would you mind starting the other thread too when you get a
chance?

On Mon, Aug 6, 2018 at 10:13 AM Jihoon Son  wrote:

> Thanks guys.
>
> I'm creating 0.12.2-rc1 now.
>
> Regarding creating an RC without vote, I think it's worth to have a
> discussion in another thread to make sure everyone knows about the new RC
> release process.
>
> Best,
> Jihoon
>
> On Sun, Aug 5, 2018 at 10:30 AM Julian Hyde 
> wrote:
>
> > Gian is correct. Creating an RC doesn’t require a vote. It does require a
> > release manager. Usually in Calcite we determine the timeframe of the
> > release, and choose an RM, by a discussion that reaches consensus without
> > an explicit vote.
> >
> > The RM may do a little “traffic control”, asking whether people consider
> > the branch is in good shape, and perhaps asking people to stop pushing,
> > again by a non-vote email thread.
> >
> > Julian
> >
> > > On Aug 5, 2018, at 8:56 AM, Gian Merlino  wrote:
> > >
> > > +1, and fwiw, it looks like Apache projects don't always need to do
> votes
> > > for creating release candidates. For example on the Calcite mailing
> list
> > I
> > > see votes for _final_ releases, but the release candidates seem to be
> > > created and uploaded without a vote. There is generally some discussion
> > on
> > > the list about whether it's a good time to do a release candidate, but
> I
> > > don't generally see formal votes. I think something similar could work
> > for
> > > us in the future and could help us get releases out quicker.
> > >
> > > On Fri, Aug 3, 2018 at 9:38 PM Prashant Deva 
> > > wrote:
> > >
> > >> +1
> > >> Prashant
> > >>
> > >>
> > >> On Fri, Aug 3, 2018 at 7:11 PM Niketh Sabbineni <
> > >> niketh.sabbin...@gmail.com>
> > >> wrote:
> > >>
> > >>> +1
> > >>>
> > >>> Looking forward to this
> > >>>
> >  On Fri, Aug 3, 2018 at 7:09 PM Jihoon Son 
> > wrote:
> > 
> >  Hi folks,
> > 
> >  Releasing 0.12.2 has been delayed because, fortunately, we could
> find
> > >>> more
> >  bugs to be fixed before release.
> > 
> >  Currently, there remains only one PR (
> >  https://github.com/apache/incubator-druid/pull/6106 ) to be merged
> > for
> >  0.12.2. Once the Travis CI passes, I'll merge that PR shortly. Then,
> > >>> we're
> >  ready for 0.12.2-rc1 release.
> > 
> >  So, I think it's time to ask your opinion about creating 0.12.2-rc1
> > >>> without
> >  the release vote. I think it makes sense because we have already had
> > >> two
> >  votes (
> > 
> > 
> > >>>
> > >>
> >
> https://lists.apache.org/thread.html/a96f2e39506118be26184bd950bc51d360107d75e9ac547d8597817a@%3Cdev.druid.apache.org%3E
> >  ,
> > 
> > 
> > >>>
> > >>
> >
> https://lists.apache.org/thread.html/11a50f22e7669a527625e190bebbe50b7586dd72733c3bf6a1024c02@%3Cdev.druid.apache.org%3E
> >  )
> >  for 0.12.2-rc1 release and there's no objection.
> > 
> >  If there's no objection for this for 48 hours, I'll start 0.12.2-rc1
> >  release.
> > 
> >  Best,
> >  Jihoon
> > 
> > >>> --
> > >>> Niketh Sabbineni
> > >>>
> > >>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > For additional commands, e-mail: dev-h...@druid.apache.org
> >
> >
>


Re: About creating 0.12.2-rc1

2018-08-06 Thread Jihoon Son
Thanks guys.

I'm creating 0.12.2-rc1 now.

Regarding creating an RC without a vote, I think it's worth having a
discussion in another thread to make sure everyone knows about the new RC
release process.

Best,
Jihoon

On Sun, Aug 5, 2018 at 10:30 AM Julian Hyde  wrote:

> Gian is correct. Creating an RC doesn’t require a vote. It does require a
> release manager. Usually in Calcite we determine the timeframe of the
> release, and choose an RM, by a discussion that reaches consensus without
> an explicit vote.
>
> The RM may do a little “traffic control”, asking whether people consider
> the branch is in good shape, and perhaps asking people to stop pushing,
> again by a non-vote email thread.
>
> Julian
>
> > On Aug 5, 2018, at 8:56 AM, Gian Merlino  wrote:
> >
> > +1, and fwiw, it looks like Apache projects don't always need to do votes
> > for creating release candidates. For example on the Calcite mailing list
> I
> > see votes for _final_ releases, but the release candidates seem to be
> > created and uploaded without a vote. There is generally some discussion
> on
> > the list about whether it's a good time to do a release candidate, but I
> > don't generally see formal votes. I think something similar could work
> for
> > us in the future and could help us get releases out quicker.
> >
> > On Fri, Aug 3, 2018 at 9:38 PM Prashant Deva 
> > wrote:
> >
> >> +1
> >> Prashant
> >>
> >>
> >> On Fri, Aug 3, 2018 at 7:11 PM Niketh Sabbineni <
> >> niketh.sabbin...@gmail.com>
> >> wrote:
> >>
> >>> +1
> >>>
> >>> Looking forward to this
> >>>
>  On Fri, Aug 3, 2018 at 7:09 PM Jihoon Son 
> wrote:
> 
>  Hi folks,
> 
>  Releasing 0.12.2 has been delayed because, fortunately, we could find
> >>> more
>  bugs to be fixed before release.
> 
>  Currently, there remains only one PR (
>  https://github.com/apache/incubator-druid/pull/6106 ) to be merged
> for
>  0.12.2. Once the Travis CI passes, I'll merge that PR shortly. Then,
> >>> we're
>  ready for 0.12.2-rc1 release.
> 
>  So, I think it's time to ask your opinion about creating 0.12.2-rc1
> >>> without
>  the release vote. I think it makes sense because we have already had
> >> two
>  votes (
> 
> 
> >>>
> >>
> https://lists.apache.org/thread.html/a96f2e39506118be26184bd950bc51d360107d75e9ac547d8597817a@%3Cdev.druid.apache.org%3E
>  ,
> 
> 
> >>>
> >>
> https://lists.apache.org/thread.html/11a50f22e7669a527625e190bebbe50b7586dd72733c3bf6a1024c02@%3Cdev.druid.apache.org%3E
>  )
>  for 0.12.2-rc1 release and there's no objection.
> 
>  If there's no objection for this for 48 hours, I'll start 0.12.2-rc1
>  release.
> 
>  Best,
>  Jihoon
> 
> >>> --
> >>> Niketh Sabbineni
> >>>
> >>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> For additional commands, e-mail: dev-h...@druid.apache.org
>
>


Re: 0.10.1 and 0.12.2 group by/search/select/top n query results

2018-08-06 Thread Gian Merlino
Hi Samarth,

It looks like you posted this message twice with two different subjects. I
responded to the other one, titled "Different query results for 0.12.2 and
0.10.1".

On Sun, Aug 5, 2018 at 9:41 PM Samarth Jain  wrote:

> I have an internal test harness setup that I am using for testing version
> upgrade from Druid 0.10.1 to 0.12.2. As part of the testing, I noticed that
> executing the same query against the same data source gives slightly
> different results for 0.10.1 and 0.12.2. I have seen this happen for
> search, group by, top n, select query types. The common part in all such
> queries is that they have a paging spec with descending set to false.
>
> "pagingSpec": {"pagingIdentifiers": {}, "threshold": 5000}
> "desceding": false
>
> My guess is that the data is distributed slightly differently within the
> two clusters which is causing this mismatch. Is my guess correct? If so, is
> there a way to make this comparison deterministic.
>
> The other thing that I observed is that with doubleSum aggregation type,
> 0.10.1 is returning values with lower precision (ex - 616346.0) as opposed
> to 0.12.1 (ex - 616346.0208094628). Did something change to cause this
> change in precision?
>


Re: Different query results for 0.12.2 and 0.10.1

2018-08-06 Thread Gian Merlino
Hi Samarth,

The doubleSum difference is likely due to the fact that before 0.11.0,
Druid read values out of columns as 32-bit floats and then cast them to
64-bit doubles. Now it can read them directly as 64-bit doubles. In fact,
it can also _store_ floating point values as 64-bit doubles, although this
won't be enabled by default until 0.13.0 (see
http://druid.io/docs/latest/configuration/index.html#double-column-storage
for how to enable it today).
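
To illustrate the effect of reading through a 32-bit float, here is a small
sketch in plain Python (not Druid code). It rounds a 64-bit value to the
nearest 32-bit float and widens it back, which is the kind of precision loss
described above. Applying it to a single value of the magnitude reported in
the thread is an assumption made for illustration; the reported numbers were
sums, and how many rows contributed to them is not known.

```python
import struct


def as_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it."""
    return struct.unpack("f", struct.pack("f", value))[0]


# A 32-bit float has a 24-bit significand. At this magnitude the spacing
# between representable values is 1/16, so the fractional part is lost.
print(as_float32(616346.0208094628))

# Hypothetical metric values (not from the thread): summing after a 32-bit
# round trip differs from summing the original 64-bit values.
values = [0.1, 0.2, 0.3]
print(sum(as_float32(v) for v in values), sum(values))
```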

Some thoughts on specific query types:

- The ordering of select results can vary due to differing choices about
which segments to read first. The results will stay in time order, but two
results with the same timestamp might swap positions. Btw, if you don't
need the strict time ordering guarantees, consider Scan queries (
http://druid.io/docs/latest/querying/scan-query.html) which are much
lighter in terms of memory usage.
- The exact ranking and values of TopN results can also vary, since topNs
are approximate and their results can vary based on which segments are
processed in which order and on which servers.
- GroupBy I would not expect to vary: what kinds of differences are you
seeing there?
- Search I'm not familiar with enough to think of a reason why it should or
shouldn't vary.
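
For select results, where rows with the same timestamp may swap positions, a
test harness can compare results in a way that ignores the order of ties. A
minimal sketch, assuming each result row has already been flattened into a
dict with a "timestamp" key (the row shape and field names here are
hypothetical, not the actual Druid response format):

```python
import json


def normalize_rows(rows):
    """Sort by timestamp, breaking ties with a canonical JSON dump of the row."""
    return sorted(rows, key=lambda r: (r["timestamp"], json.dumps(r, sort_keys=True)))


def same_rows(rows_a, rows_b):
    """True if both results hold the same rows, ignoring the order of ties."""
    return normalize_rows(rows_a) == normalize_rows(rows_b)


old = [
    {"timestamp": "2018-08-01T00:00:00.000Z", "page": "a"},
    {"timestamp": "2018-08-01T00:00:00.000Z", "page": "b"},
]
new = [old[1], old[0]]  # same rows, ties swapped

print(same_rows(old, new))
```

This only addresses ordering. TopN results are approximate, so their values
can still differ between runs.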

One thing you can do to try to get more consistent results for comparison
is to add "bySegment" : true to your query context. This will skip the
merging step and just return sub-results for each segment individually. Most
of the potential variation is introduced in the merging step, so this should
give you more consistent results. The caveat is that you won't be testing
the merging step.
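
As a sketch of what that looks like in a test harness, the helper below
copies a query and sets "bySegment" in its context. The "bySegment" key comes
from the message above. The datasource, dimension, and field names in the
example query are hypothetical.

```python
import copy
import json


def with_by_segment(query):
    """Return a copy of the query with "bySegment": true set in its context."""
    result = copy.deepcopy(query)
    result.setdefault("context", {})["bySegment"] = True
    return result


# Hypothetical groupBy query used only for illustration.
query = {
    "queryType": "groupBy",
    "dataSource": "example_datasource",
    "granularity": "all",
    "intervals": ["2018-08-01/2018-08-02"],
    "dimensions": ["example_dimension"],
    "aggregations": [
        {"type": "doubleSum", "name": "total", "fieldName": "value"}
    ],
}

print(json.dumps(with_by_segment(query), indent=2))
```

Because the original query is left untouched, the harness can run both the
merged and the per-segment variants side by side.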

On Sun, Aug 5, 2018 at 10:55 PM Samarth Jain  wrote:

> I have an internal test harness setup that I am using for testing version
> upgrade from Druid 0.10.1 to 0.12.2. As part of the testing, I noticed that
> executing the same query against the same data sources(on different druid
> clusters) gives slightly different results for 0.10.1 and 0.12.2. I have
> seen this happen for search, group by, top n, select query types. The
> common part in all such queries is that they have a paging spec with
> descending set to false.
>
> "pagingSpec": {"pagingIdentifiers": {}, "threshold": 5000}
> "desceding": false
>
> My guess is that data distribution is slightly differently within the two
> clusters which combined with paging spec is causing this mismatch. Is my
> guess correct? If so, is there a way to make such kind of testing
> deterministic.
>
> The other thing that I observed is that with doubleSum aggregation type,
> 0.10.1 is returning values with lower precision (ex - 616346.0) as opposed
> to 0.12.1 (ex - 616346.0208094628). Did something change to cause this
> change in precision?
>


Re: Podling Report (August 2018)

2018-08-06 Thread Gian Merlino
It looks like the page is up now, so I posted this report there.

On Thu, Aug 2, 2018 at 3:03 PM Gian Merlino  wrote:

> That sounds like a good question for gene...@incubator.apache.org.
>
> On Thu, Aug 2, 2018 at 2:22 PM Jonathan Wei  wrote:
>
>> Hm, looks like the wiki page https://wiki.apache.org/incubator/August2018
>> still
>> doesn't exist, any idea when it'll be up?
>>
>>
>> On Thu, Aug 2, 2018 at 1:04 PM, Julian Hyde  wrote:
>>
>> > Please email the mentors (or the dev list) when it is posted to the wiki
>> > and is ready for sign-off.
>> >
>> > Julian
>> >
>> >
>> > > On Aug 1, 2018, at 7:19 PM, Jonathan Wei  wrote:
>> > >
>> > > I don't see the incubator wiki page for August 2018 up yet, so I'll
>> post
>> > > the current report here for now:
>> > >
>> > >
>> > > Druid Podling Report (August 2018)
>> > > 
>> > >
>> > > Druid is a high-performance, column-oriented, distributed data store.
>> > >
>> > > Druid has been incubating since 2018-02-28.
>> > >
>> > > Three most important issues to address in the move towards graduation:
>> > >
>> > > 1. Plan and execute our first Apache release.
>> > > 2. Move the website to Apache infrastructure.
>> > > 3. Expanding the community and adding more committers
>> > >
>> > > Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
>> > aware
>> > > of?
>> > >
>> > > - None.
>> > >
>> > > How has the community developed since the last report?
>> > >
>> > > - A healthy, constant flow of bug fixes, quality improvements and new
>> > > features
>> > >  are still ongoing at https://github.com/apache/incubator-druid.
>> > > - Our next community meetup has been scheduled for August 8.
>> > >
>> > > How has the project developed since the last report?
>> > >
>> > > - This report covers activity since the May 2018 report
>> > > - Source has been migrated to Apache infrastructure
>> > > - License header updates are almost complete
>> > > - Since the last report there have been 93 commits from 20
>> individuals.
>> > > - We have released 0.12.1, a non-incubator release.
>> > > - We currently voting on the 0.12.2 non-incubator bug fix release.
>> This
>> > > will be our final non-incubator release.
>> > >
>> > > How would you assess the podling's maturity?
>> > > Please feel free to add your own commentary.
>> > >
>> > >  [ ] Initial setup
>> > >  [X] Working towards first release
>> > >  [ ] Community building
>> > >  [ ] Nearing graduation
>> > >  [ ] Other:
>> > >
>> > > Date of last release:
>> > >
>> > > - Druid 0.12.1 on 2018-06-08 (non-Apache release)
>> > > - No official Apache release yet since beginning Apache Incubation
>> > >
>> > > When were the last committers or PPMC members elected?
>> > >
>> > > - Project is still functioning with the initial set of committers.
>> > >
>> > > On Tue, Jul 31, 2018 at 4:02 PM, Jonathan Wei 
>> wrote:
>> > >
>> > >> Thanks for reviewing.
>> > >>
>> > >> Our last meetup was in March, but we have an upcoming meetup on
>> August 8
>> > >> (after the report is due I assume), should we mention that in this
>> > report?
>> > >>
>> > >> I noticed the incubator wiki page for August 2018 hasn't been created
>> > yet,
>> > >> does anyone know if that's expected to be up soon? (Not sure on what
>> > exact
>> > >> date our podling report is due)
>> > >>
>> > >> - Jon
>> > >>
>> > >>
>> > >> On Sun, Jul 29, 2018 at 10:30 AM, Julian Hyde <
>> jhyde.apa...@gmail.com>
>> > >> wrote:
>> > >>
>> > >>> Thanks - this looks good.
>> > >>>
>> > >>> I’d not include the url for the case to replace license headers.
>> > Reports
>> > >>> rarely include urls whose sole purpose is to prove statements in the
>> > >>> report.
>> > >>>
>> > >>> If there have been meet ups/talks about Druid, mention them. Druid
>> has
>> > a
>> > >>> vibrant community, as a result of your ongoing community building
>> > >>> activities such as talks, but still, you should take credit for
>> them.
>> > >>>
>> > >>> Julian
>> > >>>
>> >  On Jul 28, 2018, at 10:24 AM, Jonathan Wei 
>> wrote:
>> > 
>> >  Hi all,
>> > 
>> >  I'm posting a draft of the August 2018 report (which covers
>> activity
>> > >>> since
>> >  our last report in May):
>> > 
>> > 
>> > 
>> >  Druid Podling Report (August 2018)
>> >  
>> > 
>> >  Druid is a high-performance, column-oriented, distributed data
>> store.
>> > 
>> >  Druid has been incubating since 2018-02-28.
>> > 
>> >  Three most important issues to address in the move towards
>> graduation:
>> > 
>> >  1. Plan and execute our first Apache release.
>> >  2. Move the website to Apache infrastructure.
>> >  3. Expanding the community and adding more committers
>> > 
>> >  Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to
>> be
>> > >>> aware
>> >  of?
>> > 
>> >  - None.
>> > 
>> >  How has the community developed since the last report?
>> > 
>> >  - A healthy, constant flo

Podling Report Reminder - August 2018

2018-08-06 Thread jmclean
Dear podling,

This email was sent by an automated system on behalf of the Apache
Incubator PMC. It is an initial reminder to give you plenty of time to
prepare your quarterly board report.

The board meeting is scheduled for Wed, 15 August 2018, 10:30 am PDT.
The report for your podling will form a part of the Incubator PMC
report. The Incubator PMC requires your report to be submitted 2 weeks
before the board meeting, to allow sufficient time for review and
submission (Wed, August 01).
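
The due date follows directly from the meeting date. A short sketch of the
arithmetic, using only the dates stated above:

```python
from datetime import date, timedelta

board_meeting = date(2018, 8, 15)                # Wed, 15 August 2018
report_due = board_meeting - timedelta(weeks=2)  # two weeks before the meeting

print(report_due.isoformat())
```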

Please submit your report with sufficient time to allow the Incubator
PMC, and subsequently board members to review and digest. Again, the
very latest you should submit your report is 2 weeks prior to the board
meeting.

Candidate names should not be made public before people are actually
elected, so please do not include the names of potential committers or
PPMC members in your report.

Thanks,

The Apache Incubator PMC

Submitting your Report

--

Your report should contain the following:

*   Your project name
*   A brief description of your project, which assumes no knowledge of
the project or necessarily of its field
*   A list of the three most important issues to address in the move
towards graduation.
*   Any issues that the Incubator PMC or ASF Board might wish/need to be
aware of
*   How has the community developed since the last report
*   How has the project developed since the last report.
*   How does the podling rate their own maturity.

This should be appended to the Incubator Wiki page at:

https://wiki.apache.org/incubator/August2018

Note: This is manually populated. You may need to wait a little before
this page is created from a template.

Mentors
---

Mentors should review reports for their project(s) and sign them off on
the Incubator wiki page. Signing off reports shows that you are
following the project - projects that are not signed may raise alarms
for the Incubator PMC.

Incubator PMC
