From a design perspective, option 2 is the better choice. However, I'd like to see a design for how the cluster-id will be generated and maintained (peer addition/deletion scenarios, node replacement, etc.).
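To make the question concrete, here is a minimal Python sketch of one possible cluster-id lifecycle. It does not describe how glusterd behaves today. It assumes the ID is a UUID generated once when the first peer is added, that a joining node adopts the ID of the cluster it joins, and that node replacement is a detach followed by a probe. `Node`, `Cluster`, `probe`, `detach` and `replace` are hypothetical names used only for illustration.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """A hypothetical peer; cluster_id is None until it joins a cluster."""
    name: str
    cluster_id: Optional[str] = None


@dataclass
class Cluster:
    """Hypothetical model of a trusted pool that owns one cluster ID."""
    cluster_id: Optional[str] = None
    peers: Dict[str, Node] = field(default_factory=dict)

    def probe(self, node: Node) -> None:
        # Assumption: the ID is generated exactly once, on the first peer add.
        if self.cluster_id is None:
            self.cluster_id = str(uuid.uuid4())
        # Assumption: a node that already belongs elsewhere is rejected.
        if node.cluster_id not in (None, self.cluster_id):
            raise ValueError(f"{node.name} belongs to another cluster")
        node.cluster_id = self.cluster_id
        self.peers[node.name] = node

    def detach(self, name: str) -> Node:
        # Assumption: a detached peer forgets the ID; the cluster keeps it.
        node = self.peers.pop(name)
        node.cluster_id = None
        return node

    def replace(self, old_name: str, new_node: Node) -> None:
        # Assumption: replacement is detach + probe, so the ID is unchanged.
        self.detach(old_name)
        self.probe(new_node)


cluster = Cluster()
for n in ("node1", "node2", "node3"):
    cluster.probe(Node(n))
original_id = cluster.cluster_id
cluster.replace("node2", Node("node2-new"))
```

The sketch leaves open exactly the cases I would like the design to spell out: what happens when every peer is detached, and what happens if two already-formed clusters are merged.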
On Tue, Jan 14, 2020 at 1:42 PM Amar Tumballi <a...@kadalu.io> wrote:

> Hello,
>
> As we are gearing up for Release-8 and its planning, I wanted to bring up
> one of my favorite topics, 'Thin-Arbiter' (or Tie-Breaker/Metro-Cluster,
> etc.).
>
> We released thin-arbiter in v7.0 itself, and it works great when we have
> just 1 cluster of gluster. I am talking about a situation which involves
> multiple gluster clusters, and easier management of thin-arbiter nodes.
> (Ref: https://github.com/gluster/glusterfs/issues/763)
>
> I am working with a goal of hosting a thin-arbiter node service (free of
> cost), to which any gluster deployment can connect, saving the cost of an
> additional replica, which is required today to avoid getting into a
> split-brain situation. Tie-breaker storage and process needs are so small
> that we can easily handle all gluster deployments till date on just one
> machine. When I looked at the code with this goal, I found that the
> current implementation doesn't support it, mainly because it uses
> 'volumename' in the file it creates. This is good for 1 cluster, as we
> don't allow duplicate volume names in a single cluster, and OK for
> multiple clusters as long as the volume names do not collide.
>
> To resolve this properly we have 2 options (as per my thinking now) to
> make it a truly global service.
>
> 1. Add a 'volume-id' option in the afr volume itself, so each instance
>    picks the volume-id and uses it in the thin-arbiter name. A variant of
>    this is submitted for review - https://review.gluster.org/23723 - but
>    as it uses the volume-id from io-stats, this particular patch fails in
>    brick-mux and shd-mux scenarios. A proper enhancement of this patch is
>    to provide a 'volume-id' option in AFR itself, so glusterd (while
>    generating volfiles) sends the proper vol-id to each instance.
>
>    Pros: Minimal code changes to the above patch.
>    Cons: One more option to AFR (not exposed to users).
>
> 2. Add *cluster-id* to glusterd, and pass it to all processes. Let
>    replicate use this in the thin-arbiter file. This too will solve the
>    issue.
>
>    Pros: A cluster-id is good to have in any distributed system,
>    especially when there are deployments of 3 nodes each in different
>    clusters. Identifying bricks and services as part of a cluster is
>    better.
>
>    Cons: Code changes are more, and in the glusterd component.
>
> On another note, option 1 above is purely for the Thin-Arbiter feature,
> whereas option 2 would also be useful in debugging, and in other
> solutions which involve multiple clusters.
>
> Let me know what you all think about this. This is good to be discussed in
> next week's meeting, and taken to completion.
>
> Regards,
> Amar
> ---
> https://kadalu.io
> Storage made easy for k8s
>
> _______________________________________________
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-devel mailing list
> Gluster-devel@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-devel
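
The collision described in the quoted message, and how either option avoids it, can be illustrated with a small Python sketch. The file-name formats below are hypothetical and are not the names the thin-arbiter translator actually writes; the only thing taken from the thread is that the current name is derived from the volume name, option 1 from a volume-id, and option 2 from a cluster-id.

```python
import uuid


def ta_file_by_name(volname: str) -> str:
    # Hypothetical format: derived from the volume name only.
    return f"{volname}-ta"


def ta_file_by_volume_id(volume_id: str) -> str:
    # Option 1 (hypothetical format): keyed by the volume's unique ID.
    return f"{volume_id}-ta"


def ta_file_by_cluster_id(cluster_id: str, volname: str) -> str:
    # Option 2 (hypothetical format): cluster ID plus volume name.
    return f"{cluster_id}.{volname}-ta"


# Two independent clusters that both happen to have a volume called "data".
cluster_a, cluster_b = str(uuid.uuid4()), str(uuid.uuid4())
vol_a, vol_b = str(uuid.uuid4()), str(uuid.uuid4())

# Same volume name on a shared thin-arbiter node yields the same file.
collides = ta_file_by_name("data") == ta_file_by_name("data")

# Either ID makes the names distinct across clusters.
opt1_distinct = ta_file_by_volume_id(vol_a) != ta_file_by_volume_id(vol_b)
opt2_distinct = (ta_file_by_cluster_id(cluster_a, "data")
                 != ta_file_by_cluster_id(cluster_b, "data"))
```

Both options remove the collision; the difference is only in where the unique ID lives and who has to maintain it.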