Yeah, there are somewhat dirty ways to work around it, and I hadn't thought of 
this one.  Another option for us is to tag certain instances as volfile 
servers and always prevent the autoscaler from removing them.  It would be 
nice, though, if this behavior could be added to gluster itself, as in cloud 
environments we don't typically rely on having stable hostnames.  I think I'd 
also sleep better if it errored out when it couldn't retrieve that info.  I'll 
file the feature request.  It's probably out of my element, but I'll also take 
a shot at submitting a PR for it as well.
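
(For what it's worth, assuming AWS purely for illustration, that tagging
approach would look something like marking the volfile-server instances as
protected from scale-in; the group and instance names here are made up:)

    # Hypothetical ASG name and instance ID: mark the volfile-server
    # instances as protected so the autoscaler never terminates them.
    aws autoscaling set-instance-protection \
        --auto-scaling-group-name gluster-asg \
        --instance-ids i-0abc123def456789 \
        --protected-from-scale-in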

Thanks!
Tim
________________________________
From: Amar Tumballi <ama...@gmail.com>
Sent: Tuesday, October 15, 2019 8:51 PM
To: Timothy Orme <to...@ancestry.com>
Cc: gluster-users <gluster-users@gluster.org>
Subject: [EXTERNAL] Re: [Gluster-users] Client Handling of Elastic Clusters

Hi Timothy,

Thanks for this report. This seems to be a genuine issue. I don't think we have 
a solution for it right now, other than maybe, as a hack, making sure 'serverA' 
resolves to serverD's IP (or a new server's IP) in /etc/hosts on that 
particular client.
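
(For illustration, that hack would be an /etc/hosts entry on the client along
these lines; the IP address is hypothetical:)

    # /etc/hosts on the client: resolve the removed volfile server's
    # name to a surviving server's IP (address made up).
    10.0.0.14   serverA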

Meantime, it would be great if you could copy and paste this into an issue 
(https://github.com/gluster/glusterfs/issues/new) so it can be tracked.

Regards,
Amar

On Wed, Oct 16, 2019 at 12:35 AM Timothy Orme <to...@ancestry.com> wrote:
Hello,

I'm trying to set up an elastic gluster cluster and am running into a few odd 
edge cases that I'm unsure how to address.  I'll try to walk through the setup 
as best I can.

If I have a replica 3 distributed-replicated volume, with two replicated 
subvolumes to start:

MyVolume
   Replica 1
      serverA
      serverB
      serverC
   Replica 2
      serverD
      serverE
      serverF

The client mounts the volume with serverA as the primary volfile server, and 
serverB and serverC as the backups.
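
(For reference, the mount is something along these lines; the mount point is
hypothetical, and backup-volfile-servers is the glusterfs FUSE mount option
for the fallback list:)

    # Mount with serverA as the primary volfile server and serverB/
    # serverC as fallbacks; /mnt/myvolume is a hypothetical mount point.
    mount -t glusterfs \
        -o backup-volfile-servers=serverB:serverC \
        serverA:/MyVolume /mnt/myvolume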

Then, when I perform a scale-down event, it selects the first replicated 
subvolume as the one to remove.  So I end up with a configuration like:

MyVolume
   Replica 2
      serverD
      serverE
      serverF
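
(For context, the scale-down amounts to removing that replica set as a unit,
something like the following, with hypothetical brick paths:)

    # Hypothetical brick paths; remove the first replica set as a
    # whole so the replica count stays at 3.
    gluster volume remove-brick MyVolume \
        serverA:/bricks/b1 serverB:/bricks/b1 serverC:/bricks/b1 start
    # ...wait for 'remove-brick ... status' to report completed, then:
    gluster volume remove-brick MyVolume \
        serverA:/bricks/b1 serverB:/bricks/b1 serverC:/bricks/b1 commit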

Everything rebalances and works great.  However, at this point the client has 
lost its connection to every volfile server.  It knows about D, E, and F, so my 
data is all fine, but it can no longer retrieve a volfile.  In the logs I see:

[2019-10-15 17:21:59.232819] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify] 
0-glusterfsd-mgmt: Exhausted all volfile servers

This becomes problematic when I try to scale back up and add a replicated 
subvolume back in:

MyVolume
   Replica 2
      serverD
      serverE
      serverF
   Replica 3
      serverG
      serverH
      serverI
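
(Again for context, the scale-up is just an add-brick of the new replica set,
with hypothetical brick paths:)

    # Hypothetical brick paths; add the new replica set as one unit,
    # keeping the replica count at 3.
    gluster volume add-brick MyVolume \
        serverG:/bricks/b1 serverH:/bricks/b1 serverI:/bricks/b1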

And then rebalance the volume.  Now, I have all my data present, but the client 
only knows about D, E, and F, so when I run an `ls` on a directory, only about 
half of the files are returned, since the other half live on G, H, and I, which 
the client doesn't know about.  The data is still there, but seeing it would 
require a re-mount against one of the new servers.

My question, then: is there a way to have a more dynamic set of volfile 
servers?  What would be great is if the mount could fall back on the servers 
returned in the volfile itself in case the primary ones go away.

If there's not an easy way to do this, is there a flag on the mount helper that 
can cause the mount to die or error out in the event that it is unable to 
retrieve volfiles?  The problem now is that it fails silently and returns 
incomplete file listings, which for my use cases can cause improper processing 
of that data.  I'd obviously rather have it hard-error than silently return 
bad results.

Hope that makes sense; if you need further clarity, please let me know.

Thanks,
Tim


________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
