Hey Ismael,

Thanks for the KIP.

I share the view with Harsha and would like to understand how the current
approach of randomly generating cluster.id compares with the approach of
manually specifying it in meta.properties.

I think one big advantage of defining it manually in zookeeper is that we
can easily tell which cluster it is by simply looking at the sensor name,
which makes it more useful to the auditing or monitoring use-case that this
KIP intends to address. On the other hand, if you can only tell whether two
sensors are measuring the same cluster or not. Also note that even this
goal is not easily guaranteed, because you need an external mechanism to
manually re-generate znode with the old cluster.id if znode is deleted or
if the same cluster (w.r.t purpose) is changed to use a different zookeeper.

I read your reply to Harsha but still I don't fully understand your concern
with that approach. I think the broker can simply register group.id in that
znode if it is not specified yet, in the same way that this KIP proposes to
do it, right? Can you please elaborate more about your concern with this
approach?

Thanks,
Dong


On Thu, Sep 1, 2016 at 10:19 AM, Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Jun,
>
> Thanks for the feedback. Using `Cluster` was appealing because it has the
> information we need and it is already a public class, so we would not need
> to introduce a new public class with a potentially confusing name. Having
> said that, I agree with your points and I have updated the KIP so that the
> listener is called ClusterResourceListener and we now pass a
> ClusterResource instance.
>
> I have also clarified the motivation section a little and added a paragraph
> about the fact that the cluster id is only stored in ZooKeeper. I think
> this is fine for the first version and we can tackle improvements in future
> KIPs.
>
> I intend to start a vote soon as it seems like people are generally fine
> with the current proposal.
>
> Ismael
>
> On Wed, Aug 31, 2016 at 4:34 PM, Jun Rao <j...@confluent.io> wrote:
>
> > Ismael,
> >
> > Thanks for the proposal. It looks good overall. Just a comment below.
> >
> > We are adding the following interface in ClusterListener and Cluster
> > includes all brokers' endpoint and the metadata of topic partitions.
> >
> > void onClusterUpdate(Cluster cluster);
> >
> >
> > On the broker side, will that method be called when there is broker or
> > topic/partition change in metadata cache? Another thing is that Cluster
> > only includes one endpoint. This makes sense for the client. However, on
> > the broker side, it's not clear which endpoint should be used.
> >
> > In general, I am not sure how useful it is for serializers, interceptors,
> > metric reporters to know all brokers endpoints and topic/partition
> > metadata.
> >
> > I was thinking we could instead pass in sth like a ClusterResourceMeta
> and
> > just include the clusterId for now. This way, we can guarantee that
> > onClusterUpdate()
> > will only be called once and it's easier to implement on the broker side.
> >
> > Jun
> >
> >
> > On Wed, Aug 31, 2016 at 1:24 AM, Ismael Juma <ism...@juma.me.uk> wrote:
> >
> > > Thanks for the feedback Guozhang. Comment inline.
> > >
> > > On Wed, Aug 31, 2016 at 2:49 AM, Guozhang Wang <wangg...@gmail.com>
> > wrote:
> > >
> > > > About logging / debugging with the cluster id: I think the random
> UUID
> > > > itself may not be very helpful for human-readable debugging
> > information,
> > > > and we'd better use the cluster name mentioned in future work in
> > logging.
> > > >
> > >
> > > We can also add the human-readable value once it's available. However,
> > the
> > > random UUID is still useful now. After all, we use Git commit hashes in
> > > many places and they are significantly longer than what we are
> proposing
> > > here (40 instead of 22 characters) . When comparing by eye, one can
> often
> > > just look at the first few characters to distinguish. Does that make
> > sense?
> > >
> > > Ismael
> > >
> >
>

Reply via email to