My guess is that there is corruption in the volume list or peer list, which has led glusterd into an infinite loop while traversing a peer/volume list, and that this is what is hogging the CPU. Again, this is only a guess; I have not yet had a chance to take a detailed look at the logs and the strace output.
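To illustrate the kind of failure I am guessing at, here is a minimal sketch. It is not glusterd code: the `Node` class, the `peer-*` names, and the step cap are all hypothetical, made up purely for the demonstration. It shows how a linked list whose tail has been corrupted to point back at an earlier entry turns an ordinary lookup-by-name into a loop that never ends and spends its time in string comparisons, and how a tortoise-and-hare check can detect such a cycle.

```python
class Node:
    """Hypothetical list entry standing in for a peer or volume record."""
    def __init__(self, name):
        self.name = name
        self.next = None


def build(names):
    """Build a singly linked list from names and return its head."""
    head = prev = None
    for n in names:
        node = Node(n)
        if prev is None:
            head = node
        else:
            prev.next = node
        prev = node
    return head


def find(head, name, max_steps):
    """Linear lookup by name. The step cap exists only so this demo
    terminates; an uncapped loop would spin forever on a cyclic list."""
    node, steps = head, 0
    while node is not None and steps < max_steps:
        if node.name == name:  # the string comparison a spin would be stuck in
            return node, steps
        node = node.next
        steps += 1
    return None, steps


def has_cycle(head):
    """Floyd's tortoise-and-hare cycle check."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False


healthy = build(["peer-a", "peer-b", "peer-c"])
corrupt = build(["peer-a", "peer-b", "peer-c"])
corrupt.next.next.next = corrupt  # simulate corruption: tail points at head

# Look up a name that is not in either list.
_, healthy_steps = find(healthy, "peer-x", 100_000)
_, corrupt_steps = find(corrupt, "peer-x", 100_000)

print("healthy list:", healthy_steps, "steps, cycle:", has_cycle(healthy))
print("corrupt list:", corrupt_steps, "steps, cycle:", has_cycle(corrupt))
```

On the healthy list the lookup stops at the end of the list; on the corrupted one it only stops because of the artificial cap. That would be consistent with a process pinned at 100% CPU inside a string comparison routine, but it does not prove that this is what is happening here.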
I believe that if you reboot the node again, the problem will disappear.

On Tue, 22 Aug 2017 at 20:07, Serkan Çoban <cobanser...@gmail.com> wrote:

> As an addition perf top shows %80 libc-2.12.so __strcmp_sse42 during
> glusterd %100 cpu usage
> Hope this helps...
>
> On Tue, Aug 22, 2017 at 2:41 PM, Serkan Çoban <cobanser...@gmail.com>
> wrote:
> > Hi there,
> >
> > I have a strange problem.
> > Gluster version in 3.10.5, I am testing new servers. Gluster
> > configuration is 16+4 EC, I have three volumes, each have 1600 bricks.
> > I can successfully create the cluster and volumes without any
> > problems. I write data to cluster from 100 clients for 12 hours again
> > no problem. But when I try to reboot a node, glusterd process hangs on
> > %100 CPU usage and seems to do nothing, no brick processes come
> > online. You can find strace of glusterd process for 1 minutes here:
> >
> > https://www.dropbox.com/s/c7bxfnbqxze1yus/gluster_strace.out?dl=0
> >
> > Here is the glusterd logs:
> > https://www.dropbox.com/s/hkstb3mdeil9a5u/glusterd.log?dl=0
> >
> >
> > By the way, reboot of one server completes without problem if I reboot
> > the servers before creating any volumes.
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

--
- Atin (atinm)