Hi Timothy,

Thanks for this report. This seems to be a genuine issue. I don't think we have a solution for it right now, other than, as a hack, making sure that 'serverA' resolves to serverD's IP (or one of the new servers' IPs) in /etc/hosts on that particular client.
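
To make the hack concrete, the entry on the client could look roughly like the fragment below. The address is a placeholder from the 192.0.2.0/24 documentation range, not a real one; substitute serverD's actual IP. Whether an already-running client picks up the change without a remount is not confirmed in this thread, so treat this as a sketch only.

```
# /etc/hosts on the affected client
# 192.0.2.14 is a hypothetical stand-in for serverD's real address
192.0.2.14   serverA
```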

In the meantime, it would be great if you could copy and paste this into an issue (https://github.com/gluster/glusterfs/issues/new) so that we can track it.

Regards,
Amar

On Wed, Oct 16, 2019 at 12:35 AM Timothy Orme <to...@ancestry.com> wrote:

> Hello,
>
> I'm trying to setup an elastic gluster cluster and am running into a few
> odd edge cases that I'm unsure how to address. I'll try and walk through
> the setup as best I can.
>
> If I have a replica 3 distributed-replicated volume, with 2 replicated
> volumes to start:
>
> MyVolume
>    Replica 1
>       serverA
>       serverB
>       serverC
>    Replica 2
>       serverD
>       serverE
>       serverF
>
> And the client mounts the volume with serverA as the primary volfile
> server, and B & C as the backups.
>
> Then, if I perform a scale down event, it selects the first replica volume
> as the one to remove. So I end up with a configuration like:
>
> MyVolume
>    Replica 2
>       serverD
>       serverE
>       serverF
>
> Everything rebalances and works great. However, at this point, the client
> has lost any connection with a volfile server. It knows about D, E, and F,
> so my data is all fine, but it can no longer retrieve a volfile. In the
> logs I see:
>
> [2019-10-15 17:21:59.232819] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify]
> 0-glusterfsd-mgmt: Exhausted all volfile servers
>
> This becomes problematic when I try and scale back up, and add a
> replicated volume back in:
>
> MyVolume
>    Replica 2
>       serverD
>       serverE
>       serverF
>    Replica 3
>       serverG
>       serverH
>       serverI
>
> And then rebalance the volume. Now, I have all my data present, but the
> client only knows about D,E,F, so when I run an `ls` on a directory, only
> about half of the files are returned, since the other half live on G,H,I
> which the client doesn't know about. The data is still there, but it would
> require a re-mount at one of the new servers.
>
> My question then, is there a way to have a more dynamic set of volfile
> servers? What would be great is if there was a way to tell the mount to
> fall back on the servers returned in the volfile itself in case the primary
> one goes away.
>
> If there's not an easy way to do this, is there a flag on the mount helper
> that can cause the mount to die or error out in the event that it is unable
> to retrieve volfiles? The problem now is that it sort of silently fails
> and returns incomplete file listings, which for my use cases can cause
> improper processing of that data. I'd rather have it hard error than
> provide bad results silently obviously.
>
> Hope that makes sense, if you need further clarity please let me know.
>
> Thanks,
> Tim
>
> ________
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/118564314
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/118564314
>
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>
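
On the silent-failure point: until there is a proper fix, one stopgap is for the processing job to check the client log for the message quoted above before trusting a directory listing. The following is a minimal sketch under assumptions, not an existing Gluster feature. The function names are hypothetical, and the log path is left to the caller because its location depends on the mount.

```python
# Message text taken from the client log line quoted in this thread.
EXHAUSTED_MARKER = "Exhausted all volfile servers"


def volfile_servers_exhausted(log_lines):
    """Return True if any line reports that all volfile servers are exhausted."""
    return any(EXHAUSTED_MARKER in line for line in log_lines)


def check_client_log(log_path):
    """Raise RuntimeError if the client log at log_path contains the marker.

    log_path is supplied by the caller; this sketch does not assume where
    the client log for a given mount lives.
    """
    with open(log_path, "r", errors="replace") as log:
        if volfile_servers_exhausted(log):
            raise RuntimeError(
                "client reported '%s'; listings may be incomplete"
                % EXHAUSTED_MARKER
            )
```

A job would call check_client_log() with the path of its mount's log and abort on the exception instead of processing a partial listing. Note that this matches any occurrence in the file, including old ones, so a real check would need to look only at lines written since the last mount.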