Hi Mat,

Ah, right. IMO https://github.com/apache/incubator-druid/pull/6742 is a
decent workaround that makes #6176 less of a problem. It would prevent
incorrect results from happening: if the broker never gets the
initialization event, it will not start up its HTTP server or announce
itself, and so it won't get picked up by clients. If paired with monitoring
that restarts unhealthy brokers, the issue should be fully worked around in
practice.

Even though there's an (IMO) viable workaround, it would still be good to
fix the root cause of #6176. I just raised
https://github.com/apache/incubator-druid/pull/6862 to update Curator and
see if that helps -- there is a bug fixed in the latest release that looks
like it could cause the behavior we're seeing
(https://issues.apache.org/jira/browse/CURATOR-476).

My feeling is that it's still reasonable to remove the experimental label
from Druid SQL in 0.14, especially since #6742 will make SQL and native
queries behave at parity (initialization getting missed will delay broker
startup for _both_ cases). So in that sense they are at least on the same
footing. And hopefully #6862 will fix them both, together.

On Tue, Jan 15, 2019 at 7:56 AM Pierre-Emile Ferron <pe.fer...@gmail.com>
wrote:

> A remaining issue with SQL is
> https://github.com/apache/incubator-druid/issues/6176
>
> We've seen it happen several times in production on 0.12, where thankfully
> SQL doesn't power anything critical. The current workarounds are:
> 1. Restart the broker. Obviously not a good solution.
> 2. Migrate to HTTP segment discovery. I'm fine with that, and we are
> actually planning to do it soon in our clusters, but I'm still concerned
> about other Druid users—the default setting is still ZK, which means that
> SQL would still have this issue by default.
>
> Before marking SQL as non-experimental, I'd suggest either fixing the root
> cause, or making HTTP segment discovery the default and then explicitly
> deprecating ZK segment discovery.
>
>
> On Mon, Jan 14, 2019 at 2:18 PM Gian Merlino <g...@apache.org> wrote:
>
> > I'd like to propose graduating a couple of features out of 'experimental'
> > status in 0.14. Both are popular features (judging by mailing list & github
> > issue/PR activity). Both have been around for a while and have attained a
> > good level of quality and stability of API & behavior. I believe removing
> > the 'experimental' banner from these features would more accurately reflect
> > reality, and be a good signal to the user community.
> >
> > 1) Kafka indexing service. First introduced in Druid 0.9.1, it went through
> > a major protocol change in Druid 0.12.0 that added incremental publishing,
> > & 'mixing' of data from different partitions. Subjectively, quality appears
> > to be getting more solid, based on frequency of bug reports and also based
> > on our own experiences running this in production. Finally- I believe it is
> > already much more robust than Tranquility, the only 'stable' alternative.
> >
> > 2) Druid SQL. First introduced in Druid 0.10.0. It isn't feature complete
> > yet (multi-value dimensions, datasketches, etc, remain unsupported) but the
> > API and behavior have been generally stable. No major issues around memory
> > / performance / etc regressions relative to native Druid queries are
> > outstanding. IMO, it is well on its way to becoming a first class way to
> > query Druid, and it is a good time to remove the 'experimental' banner.
> >
>
