Hi all,

I'm writing to kindly ask for help with the issue in the subject line above, 
documented in:


https://bugzilla.redhat.com/show_bug.cgi?id=1381970


Brief recap:


A 3-node cluster of distributed replicated volumes (with an arbiter, confined 
to the same dedicated node for all volumes) experiences regular NFS crashes on 
at least one (non-arbiter) node at a time; both non-arbiter nodes eventually 
crash if given enough time without applying the workaround cited below. There 
are no Gluster native clients, only NFS ones, all on a dedicated network.


Simply restarting an NFS-enabled volume restarts the NFS services on all 
(non-arbiter) nodes for all volumes, and all seems well until the next crash 
(crashes happen many times a day under our normal workload).


An almost sure way of making NFS crash immediately is recreating the yum 
metadata directory on a CentOS 7 OS mirror repo hosted on an NFS-enabled volume.
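
To be concrete, the reproducer boils down to something like the following on an 
NFS client (server, volume and paths below are placeholders; the Gluster NFS 
export is mounted with NFSv3 as usual):

   mount -t nfs -o vers=3 glusternode1:/repovol /mnt/centos7-mirror
   createrepo /mnt/centos7-mirror/centos/7/os/x86_64

Running createrepo (which rewrites the whole repodata directory) is enough to 
make the Gluster NFS service crash almost immediately.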


Since this is a production cluster, we had to disable various cron jobs that 
were regularly crashing the Gluster-internal NFS service (no NFS-Ganesha in use 
here), and I am almost ready to accept even an upgrade to 3.8.x as a solution. 
I dare say so because I have seen various fixes in Gerrit that were not being 
backported to 3.7; for one of them I even filed a Bugzilla report, cloning the 
3.8 bug and kindly asking for a backport, given that the patch applied cleanly. 
This raises a question: is the backporting of patches to 3.7 being phased out 
unless explicitly requested?

The only caveat could be that the cluster is a hyperconverged setup with oVirt 
3.6.7 (but the oVirt part, with its dedicated Gluster volumes, is working 
flawlessly and is absolutely not being used to manage Gluster, only to monitor 
it), so I would need to check for 3.8 compatibility before upgrading.


Many thanks in advance to anyone who can offer any advice on this issue.


Best regards,

Giuseppe