Hi Flavio, I have added some comments inline based on my understanding of the implementation but Ismael probably can add more to this from an architectural perspective.
Cheers, Sumit On Sat, Sep 3, 2016 at 8:48 AM, Flavio Junqueira <f...@apache.org> wrote: > Thanks for the KIP, Ismael, looks great. I have just a couple of comments > and questions: > > - I assume the znode containing the cluster id is persistent. Is there > ever the need to delete that znode to force a new instance id, and if so, > how is it expected to happen? I'm thinking about a scenario in which I have > a cluster, I wipe out intentionally to reset my state, and I want to start > with a fresh instance of the cluster. In such a case, it should be as > simple as deleting the znode manually, so I'm just wondering if this is > what we are going to expect from users. The alternative is to assume that > the zookeeper ensemble is also going to be wiped out in this case, or at > least the corresponding sub-tree. > As it is currently scoped, resetting the cluster id is easy. Delete the znode for cluster id manually and reboot the the entire Kafka cluster, a new id will be generated and used. This will become more complicated when we start writing the cluster id to meta.properties. I am not sure what should be behavior then, possible options are: rewrite the znode based on meta.properties, generate a new id or maybe the brokers refuse to boot with an error condition. > - I'm not actually suggesting that we do things manually, but I can see > some benefit in manually creating the cluster id znode and have brokers > only check that the znode is there before starting. > - One of the messages mentions changing names, which is a good point. If > it isn't necessary for correctness that all brokers have either no id or > the same id, then I'd think it is fine to just watch that znode for > changes. Brokers that are connect to zk will receive eventually a > notification. But, you point out that changing the name for a working > cluster isn't necessary a desirable feature, so this might not be needed. > > You are correct, we don't want users to change the cluster id once it is created. We want it to be immutable. We would allow the user to set a human readable name using resource tags in a future KIP. The operational burden of changing the human readable name would be easy as it will involve just updating a resource tag. Also, as resource tags would be part of the metadata, clients will receive updates to tags on the next metadata refresh and send events to interceptors, metric reporters and (de)serializers which implement the ClusterResourceListener interface. Cheers, > -Flavio > > > On 03 Sep 2016, at 08:46, sumit arrawatia <sumit.arrawa...@gmail.com> > wrote: > > > > Hi Dong, > > > > If you set the cluster.id in the config, the problem is how you > > change/update the cluster.id . > > > > > > You will need to change the all the config files and make sure every one > of > > them is correct as well as update the ZK metadata. This will require a > > reboot/downtime of the entire cluster, whereas generating ids (along with > > the yet-to-be-published resource tags KIP ) makes it easy for the admins > to > > update the human readable name without reboots/ clusterwide changes to > > config. > > > > > > Also, when we implement the change to write cluster.id in > meta.properties, > > the process of updating the cluster.id becomes even more complicated. > > > > > > 1. We will have to ensure that the entire process is transactional. It > will > > be painful from an operational standpoint as it would require as much > > operational downtime and care as a version upgrade because it will modify > > the on-disk data. The recovery/rollback scenario would be very difficult > > and would probably need manual changes to meta.properties. > > > > > > 2. Given that we would want to generate an error if the cluster.id in > > meta.properties doesn't match cluster.id in ZK, we will have to setup > > complicated logic in ZK to ensure we forgo the check when changing the > > cluster.id. I am not even sure how to do it properly for a rolling > upgrade > > without downtime. > > > > > > All these points are based on my experience running Elasticsearch in > > production. ElasticSearch specifies cluster.name statically in the > > properties as well as includes it in the data directory name and changing > > it is a nightmare. > > > > > > You would think that naming changes should be rare but in my experience > > they are not. Sometimes typos creep in, sometimes you have to change the > > name to consolidate clusters or divide them and sometimes the infra team > > decides to change the deployment metadata. > > > > > > This is why I think AWS approach of (assigning immutable ids to > resources + > > human readable names in tags) works very well operationally. > > > > > > Hope this helps ! > > > > > > Cheers, > > > > Sumit > > > > On Fri, Sep 2, 2016 at 11:44 PM, Dong Lin <lindon...@gmail.com> wrote: > > > >> Hey Ismael, > >> > >> Thanks for your reply. Please see my comment inline. > >> > >> On Fri, Sep 2, 2016 at 8:28 PM, Ismael Juma <ism...@juma.me.uk> wrote: > >> > >>> Hi Dong, > >>> > >>> Thanks for your feedback. Comments inline. > >>> > >>> On Thu, Sep 1, 2016 at 7:51 PM, Dong Lin <lindon...@gmail.com> wrote: > >>>> > >>>> I share the view with Harsha and would like to understand how the > >> current > >>>> approach of randomly generating cluster.id compares with the approach > >> of > >>>> manually specifying it in meta.properties. > >>>> > >>> > >>> Harsha's suggestion in the thread was to store the generated id in > >>> meta.properties, not to manually specify it via meta.properties. > >>> > >>>> > >>>> I think one big advantage of defining it manually in zookeeper is that > >> we > >>>> can easily tell which cluster it is by simply looking at the sensor > >> name, > >>>> which makes it more useful to the auditing or monitoring use-case that > >>> this > >>>> KIP intends to address. > >>> > >>> > >>> If you really want to customise the name, it is possible with the > current > >>> proposal: save the appropriate znode in ZooKeeper before a broker > >>> auto-generates it. We don't encourage that because once you have a > >>> meaningful name, there's a good chance that you may want to change it > in > >>> the future. And things break down at that point. That's why we prefer > >>> having a generated, unique and immutable id complemented by a > changeable > >>> human readable name. As described in the KIP, we think the latter can > be > >>> achieved more generally via resource tags (which will be a separate > KIP). > >>> > >>> Can you elaborate what will break down if we need to change the name? > >> > >> Even if we can not change name because something will breakdown in that > >> case, it seems that it is still better to read id from config than > using a > >> randomly generated ID. In my suggested solution user can simply choose > not > >> to change the name and make sure there is unique id per cluster. In your > >> proposal you need to store the old cluster.id and manually restore it > in > >> zookeeper in some scenarios. What do you think? > >> > >> > >>>> On the other hand, if you can only tell whether two > >>>> sensors are measuring the same cluster or not. Also note that even > this > >>>> goal is not easily guaranteed, because you need an external mechanism > >> to > >>>> manually re-generate znode with the old cluster.id if znode is > deleted > >>> or > >>>> if the same cluster (w.r.t purpose) is changed to use a different > >>>> zookeeper. > >>>> > >>> > >>> If we assume that znodes can be deleted at random, the cluster id is > >>> probably the least of one's worries. And yes, when moving to a > >>> different ZooKeeper while wanting to retain the cluster id, you would > >> have > >>> to set the znode manually. This doesn't seem too onerous compared to > the > >>> other work you will have to do for this scenario. > >>> > >>> Maybe this work is not much compared to other work. But we can agree > that > >> no work is better than little work, right? I am interested to see if we > can > >> avoid the work and still meet the motivation and goals of this KIP. > >> > >> > >>>> I read your reply to Harsha but still I don't fully understand your > >>> concern > >>>> with that approach. I think the broker can simply register group.id > in > >>>> that > >>>> znode if it is not specified yet, in the same way that this KIP > >> proposes > >>> to > >>>> do it, right? Can you please elaborate more about your concern with > >> this > >>>> approach? > >>>> > >>> > >>> It's a bit difficult to answer this comment because it seems like the > >>> intent of your suggestion is different than Harsha's. > >>> > >>> I am not necessarily opposed to storing the cluster id in > meta.properties > >>> (note that we have one meta.properties per log.dir), but I think there > >> are > >>> a number of things that need to be discussed and I don't think we need > to > >>> block KIP-78 while that takes place. Delivering features incrementally > >> is a > >>> good thing in my opinion (KIP-31/32, KIP-33 and KIP-79 is a good recent > >>> example). > >>> > >> > >> If I understand it right, the motivation of this KIP is to allow > cluster to > >> be uniquely identified. This is a useful feature and I am not asking for > >> anything beyond this scope. It is just that reading cluster.id from > config > >> seems to be a better solution in order to meet the motivation and all > the > >> goals described in the KIP. More specifically, using cluster.id not > only > >> allows user to distinguish between different clusters, it also lets user > >> identify cluster. In comparison, randomly generated cluster.id allows > user > >> to distinguish cluster with a little bit more effort, and doesn't allow > >> user to identify a cluster by simply reading e.g. sensor name. Did I > miss > >> something here? > >> > >> > >>> > >>> Ismael > >>> > >>> P.S. For what is worth, the following version of the KIP includes an > >>> incomplete description (it assumes a single meta.properties, but there > >>> could be many) of what the broker would have to do if we wanted to save > >> to > >>> meta.properties and potentially restore the znode from it. The state > >> space > >>> becomes a lot more complex, increasing potential for bugs (we had a few > >> for > >>> generated broker ids). In contrast, the current proposal is very simple > >> and > >>> doesn't prevent us from introducing the additional functionality later. > >>> > >>> https://cwiki.apache.org/confluence/pages/viewpage. > >> action?pageId=65868433 > >>> > >> > >> IMO reading cluster.id from config should be as easy as reading broker > id > >> from config. Storing cluster.id from config in znode requires the same > >> amount of effort as storing randomly generated cluster.id in znode. > Maybe > >> I > >> missed something here. Can you point me to the section of the KIP that > >> explains why it is more difficult if we want to read cluster.id from > >> config? > >> > > > > > > > > -- > > Regards, > > Sumit > > -- Regards, Sumit