Hi Ahemad,

Sorry for all the back and forth on this, but we need a few more details to
find the actual cause. What version of Gluster are you running on the server
and client nodes? Also, please provide a statedump [1] of the bricks and of
the client process when the hang is seen.
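For completeness, the statedumps Karthik asks for are usually gathered along
these lines (a sketch based on the linked troubleshooting page; the volume
name `glustervol` and the mount point `/mnt` are taken from later in this
thread and may differ on your setup):

```shell
# On a server node: dump the state of all brick processes of the volume.
# The dumps land under /var/run/gluster by default.
gluster volume statedump glustervol

# On the client: send SIGUSR1 to the fuse client process; it writes its
# statedump to the same default directory.
kill -USR1 "$(pgrep -f 'glusterfs.*/mnt')"
```

Both steps need a running cluster, so they are shown here only as a sketch.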
[1] https://docs.gluster.org/en/latest/Troubleshooting/statedump/

Regards,
Karthik

On Wed, Jun 17, 2020 at 9:25 AM ahemad_sh...@yahoo.com
<ahemad_sh...@yahoo.com> wrote:
> I have a 3-replica Gluster volume created across 3 nodes. When one node
> went down due to an issue, the clients were not able to access the
> volume -- that was the problem. I have fixed the server and it is back,
> but there was downtime at the client. I want to avoid that downtime,
> since it is a 3-replica volume.
>
> I am testing high availability now by rebooting or shutting down one of
> the brick servers manually. I want the volume to always be accessible to
> the clients; that is the reason we went for a replicated volume.
>
> So I would like to know how to keep the client volume highly available
> even when a VM or node hosting a Gluster brick goes down unexpectedly.
> We had downtime of 10 hours.
>
> The glusterfsd service, which is used only for stopping, is disabled in
> my cluster, and I see one more service running, glusterd.
>
> Will starting the glusterfsd service on all 3 replica nodes help in
> achieving what I am trying to do?
>
> Hope I am clear.
>
> Thanks,
> Ahemad
>
> On Tue, Jun 16, 2020 at 23:12, Strahil Nikolov
> <hunter86...@yahoo.com> wrote:
> In my cluster, the service is enabled and running.
>
> What actually is your problem?
> When a Gluster brick process dies unexpectedly, all FUSE clients will
> wait for the timeout. The glusterfsd service ensures that during system
> shutdown the brick processes are stopped in such a way that native
> clients won't hang waiting for the timeout, but will directly choose
> another brick.
>
> The same happens when you manually run the kill script: all Gluster
> processes shut down and all clients are redirected to another brick.
>
> Keep in mind that FUSE mounts will also be killed, both by the script
> and by the glusterfsd service.
>
> Best Regards,
> Strahil Nikolov
>
> On 16 June 2020
19:48:32 GMT+03:00, ahemad shaik <ahemad_sh...@yahoo.com> wrote:
>> Hi Strahil,
>>
>> I have the Gluster setup on a CentOS 7 cluster. I see the glusterfsd
>> service, and it is in the inactive state:
>>
>> systemctl status glusterfsd.service
>> ● glusterfsd.service - GlusterFS brick processes (stopping only)
>>    Loaded: loaded (/usr/lib/systemd/system/glusterfsd.service;
>>            disabled; vendor preset: disabled)
>>    Active: inactive (dead)
>>
>> So you mean starting this service on all the nodes where the Gluster
>> volumes are created will solve the issue?
>>
>> Thanks,
>> Ahemad
>>
>> On Tuesday, 16 June, 2020, 10:12:22 pm IST, Strahil Nikolov
>> <hunter86...@yahoo.com> wrote:
>>
>> Hi ahemad,
>>
>> The script kills all Gluster processes, so the clients won't wait for
>> the timeout before switching to another node in the TSP.
>>
>> In CentOS/RHEL there is a systemd service called 'glusterfsd.service'
>> that takes care of killing all processes on shutdown, so clients
>> won't hang.
>>
>> systemctl cat glusterfsd.service --no-pager
>> # /usr/lib/systemd/system/glusterfsd.service
>> [Unit]
>> Description=GlusterFS brick processes (stopping only)
>> After=network.target glusterd.service
>>
>> [Service]
>> Type=oneshot
>> # glusterd starts the glusterfsd processes on-demand
>> # /bin/true will mark this service as started, RemainAfterExit keeps
>> # it active
>> ExecStart=/bin/true
>> RemainAfterExit=yes
>> # if there are no glusterfsd processes, a stop/reload should not give
>> # an error
>> ExecStop=/bin/sh -c "/bin/killall --wait glusterfsd || /bin/true"
>> ExecReload=/bin/sh -c "/bin/killall -HUP glusterfsd || /bin/true"
>>
>> [Install]
>> WantedBy=multi-user.target
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On 16 June 2020
18:41:59 GMT+03:00, ahemad shaik <ahemad_sh...@yahoo.com> wrote:
>> Hi,
>>
>> I see there is a script file at the below path on all the nodes used
>> to create the Gluster volume:
>> /usr/share/glusterfs/scripts/stop-all-gluster-processes.sh
>>
>> Do I need to create a systemd service so that this script is called
>> whenever a server goes down, or does it need to run always, so that
>> when a node is down the clients will not have any issues accessing
>> the mount point?
>>
>> Can you please share any documentation on how to use this? That would
>> be a great help.
>>
>> Thanks,
>> Ahemad
>>
>> On Tuesday, 16 June, 2020, 08:59:31 pm IST, Strahil Nikolov
>> <hunter86...@yahoo.com> wrote:
>>
>> Hi Ahemad,
>>
>> You can simplify it by creating a systemd service that will call the
>> script.
>>
>> It was already mentioned in a previous thread (with an example), so
>> you can just use it.
>>
>> Best Regards,
>> Strahil Nikolov
>>
>> On 16 June 2020 16:02:07 GMT+03:00, Hu Bert <revi...@googlemail.com>
>> wrote:
>>> Hi,
>>>
>>> if you simply reboot or shut down one of the Gluster nodes, there
>>> might be a (short or medium) unavailability of the volume on the
>>> clients. To avoid this there is a script:
>>>
>>> /usr/share/glusterfs/scripts/stop-all-gluster-processes.sh (the path
>>> may differ depending on the distribution)
>>>
>>> If I remember correctly, this notifies the clients that this node is
>>> going to be unavailable (please correct me if the details are
>>> wrong). When I reboot a Gluster node, I always call this script and
>>> have never seen unavailability issues on the clients.
>>>
>>> Regards,
>>> Hubert
>>>
>>> On Mon, 15 June 2020 at 19:36, ahemad shaik
>>> <ahemad_sh...@yahoo.com> wrote:
>>>>
>>>> Hi There,
>>>>
>>>> I have created a 3-replica Gluster volume with 3 bricks from 3
>>>> nodes.
>>>>
>>>> "gluster volume create glustervol replica 3 transport tcp
>>>> node1:/data node2:/data node3:/data force"
>>>>
>>>> It is mounted on the client node using the command below:
>>>>
>>>> "mount -t glusterfs node4:/glustervol /mnt/"
>>>>
>>>> When any of the nodes (node1, node2 or node3) goes down, the
>>>> Gluster mount/volume (/mnt) is not accessible at the client
>>>> (node4).
>>>>
>>>> The purpose of a replicated volume is high availability, but I am
>>>> not able to achieve it.
>>>>
>>>> Is it a bug, or am I missing anything?
>>>>
>>>> Any suggestions would be a great help!
>>>>
>>>> Kindly suggest.
>>>>
>>>> Thanks,
>>>> Ahemad
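The systemd service Strahil suggests for calling the kill script could look
roughly like this, modeled on the glusterfsd.service unit quoted earlier in
the thread (a sketch, not tested; the unit name is hypothetical, and the
script path is the one mentioned above, which may differ per distribution):

```
# /etc/systemd/system/stop-all-gluster-processes.service (hypothetical name)
[Unit]
Description=Stop all Gluster processes cleanly on shutdown (stopping only)
After=network.target glusterd.service

[Service]
Type=oneshot
# /bin/true marks the service as started; RemainAfterExit keeps it active
ExecStart=/bin/true
RemainAfterExit=yes
# On stop (i.e. shutdown/reboot), run the cleanup script so clients fail
# over immediately instead of waiting for the timeout
ExecStop=/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh

[Install]
WantedBy=multi-user.target
```

After placing the unit on each brick node, it would be activated with
`systemctl daemon-reload` followed by
`systemctl enable --now stop-all-gluster-processes.service`.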
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
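One more note on the client-side mount discussed in this thread: the server
named in the mount command is only used to fetch the volfile, and failover
behaviour can additionally be tuned with options such as the following (a
sketch; the node names match the thread, but the 10-second timeout is purely
illustrative):

```shell
# Fetch the volfile from node1, falling back to node2/node3 if it is down
mount -t glusterfs -o backup-volfile-servers=node2:node3 \
      node1:/glustervol /mnt

# Optionally shorten how long clients wait before declaring a brick dead
# (the default is 42 seconds; setting it too low can cause spurious
# disconnects under load)
gluster volume set glustervol network.ping-timeout 10
```

These commands require a live cluster, so they are shown only as a sketch;
note that a shorter ping-timeout is a trade-off, not a substitute for
stopping bricks cleanly on shutdown.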