Hi Harsha,

It's a good question. If your broker connects to a different ZooKeeper root
(whether on the same server or not), the outcome depends on whether a
cluster id already exists there. If it does, then that cluster id will be
used. If not, a new cluster id will be generated.
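
As a rough sketch of that behaviour (Python, purely illustrative: a plain
dict stands in for the ZooKeeper root, and `resolve_cluster_id`,
`generate_cluster_id` and the `cluster_id` key are hypothetical names, not
Kafka code; the id format shown is also an assumption):

```python
import base64
import uuid


def generate_cluster_id() -> str:
    # Assumption: a random UUID, URL-safe base64 encoded, padding stripped.
    raw = uuid.uuid4().bytes
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def resolve_cluster_id(zk_root: dict) -> str:
    """Return the cluster id found under this root, generating one if absent."""
    existing = zk_root.get("cluster_id")
    if existing is not None:
        # A cluster id already exists under this root: reuse it.
        return existing
    # Nothing registered under this root yet: generate and store a new id.
    new_id = generate_cluster_id()
    zk_root["cluster_id"] = new_id
    return new_id
```

Pointing the same broker at a fresh, empty root therefore yields a
different id from the one it saw under the old root.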

The existing KIP doesn't try to solve that problem, although we listed the
following under "Future Improvements":

> 3. Use the cluster id to ensure that brokers are connected to the right
> cluster: it's useful, but something that can be done later via a separate
> KIP. One of the discussion points is how the broker knows its cluster id
> (e.g. via a config or by storing it after the first connection to the
> cluster).


One of the options matches your suggestion to store the cluster id in
`meta.properties`. We were thinking that it would make sense to reject the
connection if the cluster id did not match. In that case, migrating Kafka
to a different ZooKeeper root would require setting the cluster id on the
new root before migrating. Another option is to set it automatically if the
cluster id is not set in ZooKeeper (i.e. if there's a cluster id in
`meta.properties` and there isn't one in ZooKeeper, set the cluster id
in ZooKeeper). This is perhaps a bit too much magic as we don't
auto-migrate anything else like ACLs when you change the ZooKeeper root.
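
To make the two options concrete, here is a minimal sketch (Python, purely
illustrative: `check_cluster_id`, `ClusterIdMismatch`, the `auto_set` flag
and the `cluster_id` key are hypothetical names, and a plain dict stands in
for the ZooKeeper root; none of this is existing Kafka code):

```python
from typing import Optional


class ClusterIdMismatch(Exception):
    """Raised when the locally stored cluster id disagrees with ZooKeeper."""


def check_cluster_id(stored_id: Optional[str], zk_root: dict,
                     auto_set: bool = False) -> Optional[str]:
    """Validate the id from meta.properties against the one in ZooKeeper.

    stored_id: cluster id read from meta.properties, or None if not stored.
    zk_root:   stand-in for the ZooKeeper root the broker connects to.
    auto_set:  if True, copy the stored id into an empty root (second option).
    """
    zk_id = zk_root.get("cluster_id")
    if stored_id is None:
        # Nothing stored locally yet, so there is nothing to validate.
        return zk_id
    if zk_id is None:
        if auto_set:
            # Second option: set the id in ZooKeeper automatically.
            zk_root["cluster_id"] = stored_id
            return stored_id
        # First option: the id must be set on the new root before migrating.
        raise ClusterIdMismatch("no cluster id under this ZooKeeper root")
    if zk_id != stored_id:
        raise ClusterIdMismatch(f"expected {stored_id}, found {zk_id}")
    return zk_id
```

With `auto_set=False`, a broker moved to an empty root is rejected until
someone sets the id there; with `auto_set=True`, the id follows the broker.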

In any case, we think the above can be tackled in a subsequent KIP while
the existing one is valuable in its current form. Does that make sense?

Thanks,
Ismael

On Mon, Aug 29, 2016 at 6:51 PM, Harsha Chintalapani <ka...@harsha.io>
wrote:

> Ismael,
> What happens when the cluster.id changes from its initial value? For
> example, users change their zookeeper.root and a new cluster.id is
> generated. Do you think it would be useful to store this in
> meta.properties along with broker.id, so that we only generate it once
> and store it on disk?
>
> Thanks,
> Harsha
>
> On Sat, Aug 27, 2016 at 4:47 PM Gwen Shapira <g...@confluent.io> wrote:
>
> > Thanks Ismael, this looks great.
> >
> > One of the things you mentioned is that cluster ID will be useful in
> > log aggregation. Perhaps it makes sense to include cluster ID in the
> > log? For example, as one of the things a broker logs after startup?
> > And ideally clients would log that as well after successful parsing of
> > MetadataResponse?
> >
> > Gwen
> >
> >
> > On Sat, Aug 27, 2016 at 4:39 AM, Ismael Juma <ism...@juma.me.uk> wrote:
> > > Hi all,
> > >
> > > We've posted "KIP-78: Cluster Id" for discussion:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-78%3A+Cluster+Id
> > >
> > > Please take a look. Your feedback is appreciated.
> > >
> > > Thanks,
> > > Ismael
> >
> >
> >
> > --
> > Gwen Shapira
> > Product Manager | Confluent
> > 650.450.2760 | @gwenshap
> > Follow us: Twitter | blog
> >
>
