Could you check the glusterd log on the other nodes? That should give you a hint as to the exact issue. Also, looking at .cmd_log_history will show you the interval at which the volume status commands are executed. If the gap is in milliseconds, then you are bound to hit this, and it's expected.
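To make the interval check concrete, here is a minimal sketch that extracts the timestamps of volume status entries and prints the gaps between them. It assumes each .cmd_log_history line starts with a bracketed timestamp in the same style as the glusterd log lines quoted below; the sample lines and the volume name gv0 are made up for illustration, so adjust the pattern if your file looks different.

```python
import re
from datetime import datetime

# Assumed line shape (not confirmed for every version):
#   [2015-08-03 13:45:29.974560]  : volume status : SUCCESS
STAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]")


def status_gaps(lines):
    """Return the gaps, in seconds, between consecutive 'volume status' entries."""
    stamps = []
    for line in lines:
        if "volume status" not in line:
            continue  # ignore other commands
        match = STAMP.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f"))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


# Hypothetical sample entries; "gv0" is an invented volume name.
sample = [
    "[2015-08-03 13:45:29.974560]  : volume status : SUCCESS",
    "[2015-08-03 13:45:29.980560]  : volume status gv0 : FAILED : Locking failed",
    "[2015-08-03 13:49:48.273159]  : volume status : SUCCESS",
]

gaps = status_gaps(sample)
for gap in gaps:
    # Flag anything under one second as a likely lock collision.
    print(f"{gap:.6f}s", "<- very close together" if gap < 1.0 else "")
```

To run it against a real file, pass an open file object instead of the sample list, for example `status_gaps(open(path))`, where `path` points at .cmd_log_history on the node in question (typically in the glusterfs log directory). Gaps well under a second between commands issued from different nodes would match the collision described above.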
-Atin
Sent from one plus one

On Aug 3, 2015 7:32 PM, "Osborne, Paul (paul.osbo...@canterbury.ac.uk)" <paul.osbo...@canterbury.ac.uk> wrote:
>
> Hi,
>
> Last week I upgraded one of my gluster clusters (3 hosts with bricks as
> replica 3) to 3.6.4 from 3.5.4 and all seemed well.
>
> Today I am getting reports that locking has failed:
>
> gfse-cant-01:/var/log/glusterfs# gluster volume status
> Locking failed on gfse-rh-01.core.canterbury.ac.uk. Please check log file for details.
> Locking failed on gfse-isr-01.core.canterbury.ac.uk. Please check log file for details.
>
> Logs:
> [2015-08-03 13:45:29.974560] E [glusterd-syncop.c:1640:gd_sync_task_begin] 0-management: Locking Peers Failed.
> [2015-08-03 13:49:48.273159] E [glusterd-syncop.c:105:gd_collate_errors] 0-: Locking failed on gfse-rh-01.core.canterbury.ac.uk. Please check log file for details.
> [2015-08-03 13:49:48.273778] E [glusterd-syncop.c:105:gd_collate_errors] 0-: Locking failed on gfse-isr-01.core.canterbury.ac.uk. Please check log file for details.
>
> I am wondering if this is a new feature due to 3.6.4 or something that has
> gone wrong.
>
> Restarting gluster entirely (btw the restart script does not actually
> appear to kill the processes...) resolves the issue but then it repeats a
> few minutes later which is rather suboptimal for a running service.
>
> Googling suggests that there may be simultaneous actions going on that can
> cause a locking issue.
>
> I know that I have nagios running volume status <volname> for each of my
> volumes on each host every few minutes however this is not new and has been
> in place for the last 8-9 months that against 3.5 without issue so would
> hope that this is not causing the issue.
>
> I am not sure where to look now tbh.
>
> Paul Osborne
> Senior Systems Engineer
> Canterbury Christ Church University
> Tel: 01227 782751
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users