Re: [Gluster-users] cannot add server back to cluster after reinstallation
Thanks all for the help! The cluster has been up for a few hours now with no reported errors, so I guess replacement of the server went ultimately fine ;-)

Ciao,
R
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
Re: [Gluster-users] cannot add server back to cluster after reinstallation
----- Original Message -----
From: "Atin Mukherjee"
To: "Rafi Kavungal Chundattu Parambil", "Riccardo Murri"
Cc: gluster-users@gluster.org
Sent: Wednesday, March 27, 2019 4:07:42 PM
Subject: Re: [Gluster-users] cannot add server back to cluster after reinstallation

On Wed, 27 Mar 2019 at 16:02, Riccardo Murri wrote:

> However, `volume status` shows a missing snapshotd on the reinstalled
> server (the 002 one).

I believe you ran this command on 002? In that case it is shown as localhost.

> We're not using snapshots so I guess this is fine too?

Is features.uss enabled for this volume? Otherwise we don’t show snapd information in status output. Rafi - am I correct?

Yes. We don't show snapd information unless uss is enabled. So please check whether uss is enabled. You can use

    gluster v get glusterfs features.uss

If you are not using any snapshots, it doesn't make sense to use uss. You can disable it using

    gluster v set glusterfs features.uss disable

Please note that if you are doing a rolling upgrade, it is not recommended to make any configuration changes. In that case you can disable it after completing the upgrade.

Rafi KC
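The check Rafi describes can be scripted. The following is only a sketch: the helper name `uss_enabled` is made up for illustration, and it assumes that `gluster v get` prints the option on a line of the form `features.uss <value>`, with "on" or "enable" style values meaning enabled.

```shell
# Sketch: succeed (exit 0) if the text on stdin says features.uss is enabled.
# Feed it the output of `gluster v get <volume> features.uss`.
# Assumption: the option line looks like "features.uss <value>".
uss_enabled() {
    awk '$1 == "features.uss" {
             v = tolower($2)
             if (v == "on" || v == "enable" || v == "enabled" ||
                 v == "yes" || v == "true" || v == "1") found = 1
         }
         END { exit(found ? 0 : 1) }'
}

# Usage on a cluster node (volume name "glusterfs" as in this thread):
#   if gluster v get glusterfs features.uss | uss_enabled; then
#       gluster v set glusterfs features.uss disable
#   fi
```

As noted in the message, the `set` step is best left until after a rolling upgrade has completed.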
Re: [Gluster-users] cannot add server back to cluster after reinstallation
On Wed, 27 Mar 2019 at 16:02, Riccardo Murri wrote:

> However, `volume status` shows a missing snapshotd on the reinstalled
> server (the 002 one).

I believe you ran this command on 002? In that case it is shown as localhost.

> We're not using snapshots so I guess this is fine too?

Is features.uss enabled for this volume? Otherwise we don’t show snapd information in status output. Rafi - am I correct?

--Atin
Re: [Gluster-users] cannot add server back to cluster after reinstallation
Hello Atin,

> Check cluster.op-version, peer status, volume status output. If they are all
> fine you’re good.

Both `op-version` and `peer status` look fine:

```
# gluster volume get all cluster.max-op-version
Option                                  Value
------                                  -----
cluster.max-op-version                  31202

# gluster peer status
Number of Peers: 4

Hostname: glusterfs-server-004
Uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
State: Peer in Cluster (Connected)

Hostname: glusterfs-server-005
Uuid: d53398f6-19d4-4633-8bc3-e493dac41789
State: Peer in Cluster (Connected)

Hostname: glusterfs-server-003
Uuid: 3c74d2b4-a4f3-42d4-9511-f6174b0a641d
State: Peer in Cluster (Connected)

Hostname: glusterfs-server-001
Uuid: 60bcc47e-ccbe-493e-b4ea-d45d63123977
State: Peer in Cluster (Connected)
```

However, `volume status` shows a missing snapshotd on the reinstalled server (the 002 one). We're not using snapshots so I guess this is fine too?

```
# gluster volume status
Status of volume: glusterfs
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick glusterfs-server-005:/srv/glusterfs   49152     0          Y       1410
Brick glusterfs-server-004:/srv/glusterfs   49152     0          Y       1416
Brick glusterfs-server-003:/srv/glusterfs   49152     0          Y       1520
Brick glusterfs-server-001:/srv/glusterfs   49152     0          Y       1266
Brick glusterfs-server-002:/srv/glusterfs   49152     0          Y       3011
Snapshot Daemon on localhost                N/A       N/A        Y       3029
Snapshot Daemon on glusterfs-server-001     49153     0          Y       1361
Snapshot Daemon on glusterfs-server-005     49153     0          Y       1478
Snapshot Daemon on glusterfs-server-004     49153     0          Y       1490
Snapshot Daemon on glusterfs-server-003     49153     0          Y       1563

Task Status of Volume glusterfs
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 0eaf6ad1-df95-48f4-b941-17488010ddcc
Status               : failed
```

Thanks,
Riccardo
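Reading the Online column by eye gets tedious on larger clusters. The sketch below filters `gluster volume status` text for rows marked offline. The helper name `offline_rows` is hypothetical, and it assumes every process row ends with `<tcp-port> <rdma-port> <Y|N> <pid>`, so that the Online flag is the second-to-last field.

```shell
# Sketch: print the rows of `gluster volume status` whose Online flag is "N".
# Reads the status text on stdin.
# Assumption: process rows end with "<tcp-port> <rdma-port> <Y|N> <pid>".
# If the CLI wrapped a long name onto the previous line, only the line that
# carries the ports and the flag is printed.
offline_rows() {
    awk 'NF >= 4 && $(NF-1) == "N" { print }'
}

# Usage on a cluster node:
#   gluster volume status | offline_rows
```

For the output quoted above the filter would print nothing, since every brick and snapshot daemon shows `Y`.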
Re: [Gluster-users] cannot add server back to cluster after reinstallation
On Wed, 27 Mar 2019 at 15:24, Riccardo Murri wrote:

> I managed to put the reinstalled server back into connected state with
> this procedure:
>
> 1. Run `for other_server in ...; do gluster peer probe $other_server; done`
>    on the reinstalled server
> 2. Now all the peers on the reinstalled server show up as "Accepted
>    Peer Request", which I fixed with the procedure outlined in the last
>    paragraph of
>    https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-glusterd/#debugging-glusterd
>
> Can anyone confirm that this is a good way to proceed and I won't be
> heading quickly towards corrupting volume data?

Check cluster.op-version, peer status, volume status output. If they are all fine you’re good.

--Atin
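The peer status part of this check can be automated. This is a minimal sketch with a hypothetical helper name, `unhealthy_peers`. It assumes `gluster peer status` prints `Hostname:` and `State:` lines as in the outputs quoted elsewhere in this thread, and it treats anything other than "Peer in Cluster (Connected)" as a problem.

```shell
# Sketch: report peers whose state is not "Peer in Cluster (Connected)".
# Reads `gluster peer status` text on stdin, prints "<hostname>: <state>"
# for each problem peer, and returns non-zero if any was found.
unhealthy_peers() {
    awk '
        /^Hostname:/ { host = $2 }
        /^State:/ {
            state = $0
            sub(/^State:[ \t]*/, "", state)   # drop the "State:" label
            sub(/[ \t\r]+$/, "", state)       # drop trailing whitespace
            if (state != "Peer in Cluster (Connected)") {
                print host ": " state
                bad = 1
            }
        }
        END { exit(bad ? 1 : 0) }
    '
}

# Usage on each server in turn:
#   gluster peer status | unhealthy_peers || echo "some peers need attention"
```

Running it on every server matters, because in this thread the reinstalled server and the other servers reported different views of the peer list.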
Re: [Gluster-users] cannot add server back to cluster after reinstallation
+Sanju Rakonde & +Atin Mukherjee adding glusterd folks who can help here.

On Wed, Mar 27, 2019 at 3:24 PM Riccardo Murri wrote:

> I managed to put the reinstalled server back into connected state with
> this procedure:
>
> 1. Run `for other_server in ...; do gluster peer probe $other_server; done`
>    on the reinstalled server
> 2. Now all the peers on the reinstalled server show up as "Accepted
>    Peer Request", which I fixed with the procedure outlined in the last
>    paragraph of
>    https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-glusterd/#debugging-glusterd
>
> Can anyone confirm that this is a good way to proceed and I won't be
> heading quickly towards corrupting volume data?
Re: [Gluster-users] cannot add server back to cluster after reinstallation
I managed to put the reinstalled server back into connected state with this procedure:

1. Run `for other_server in ...; do gluster peer probe $other_server; done` on the reinstalled server
2. Now all the peers on the reinstalled server show up as "Accepted Peer Request", which I fixed with the procedure outlined in the last paragraph of https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-glusterd/#debugging-glusterd

Can anyone confirm that this is a good way to proceed and I won't be heading quickly towards corrupting volume data?

Thanks,
Riccardo
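Step 1 of the procedure above could be written out as a small function. This is a sketch only: `probe_peers` and the `GLUSTER` override variable are names invented here, and the hostnames in the usage comment are the ones that appear in this thread. The first argument is the reinstalled server's own hostname, which is skipped.

```shell
# Sketch of step 1: probe every other server from the reinstalled one.
# GLUSTER can be overridden (for example GLUSTER=echo) for a dry run.
GLUSTER=${GLUSTER:-gluster}

probe_peers() {
    self=$1    # hostname of the reinstalled server, not probed
    shift
    for other_server in "$@"; do
        if [ "$other_server" = "$self" ]; then
            continue
        fi
        # Stop at the first failed probe so the error is not buried.
        $GLUSTER peer probe "$other_server" || return 1
    done
}

# Usage on the reinstalled server (002 in this thread):
#   probe_peers glusterfs-server-002 \
#       glusterfs-server-001 glusterfs-server-002 glusterfs-server-003 \
#       glusterfs-server-004 glusterfs-server-005
```

Setting `GLUSTER=echo` first prints the commands instead of running them, which is a cheap way to review the host list before touching the cluster.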
[Gluster-users] cannot add server back to cluster after reinstallation
Hello,

a couple of days ago, the OS disk of one of the servers of a local GlusterFS cluster suffered a bad crash, and I had to reinstall everything from scratch.

However, when I restart the GlusterFS service on the server that has been reinstalled, I see that it sends back a "RJT" response to the other servers of the cluster, which then list it as "State: Peer Rejected (Connected)"; the reinstalled server instead shows "Number of peers: 0".

The DEBUG level log on the reinstalled machine shows these lines after the peer probe from another server in the cluster:

    I [MSGID: 106490] [glusterd-handler.c:2540:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
    D [MSGID: 0] [glusterd-peer-utils.c:208:glusterd_peerinfo_find_by_uuid] 0-management: Friend with uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318, not found
    D [MSGID: 0] [glusterd-peer-utils.c:234:glusterd_peerinfo_find] 0-management: Unable to find peer by uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
    D [MSGID: 0] [glusterd-peer-utils.c:132:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: glusterfs-server-004
    D [MSGID: 0] [glusterd-peer-utils.c:246:glusterd_peerinfo_find] 0-management: Unable to find hostname: glusterfs-server-004
    I [MSGID: 106493] [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to glusterfs-server-004 (24007), ret: 0, op_ret: -1

What can I do to re-add the reinstalled server into the cluster? Is it safe (= keeps data) to "peer detach" it and then "peer probe" again?

Additional info:

* The actual GlusterFS brick data was on a different disk and so is safe and mounted back in the original location.
* I copied back the `/etc/glusterfs/glusterd.vol` from the other servers in the cluster and restored the UUID into `/var/lib/glusterfs/glusterd.info`
* I have checked that `max.op-version` is the same on all servers of the cluster, including the reinstalled one.
* All servers run Ubuntu 16.04

Thanks for any suggestion!

Riccardo
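For the "restored the UUID" step, the old UUID of the reinstalled server usually has to be looked up on one of the surviving peers. The sketch below rests on assumptions not stated in this thread: that each surviving server keeps one file per known peer in a directory such as `/var/lib/glusterd/peers`, and that those files contain `uuid=<uuid>` and `hostname1=<name>` lines. The helper name `peer_uuid_for_host` is hypothetical. Note that the message above writes the state directory as `/var/lib/glusterfs`, so check which path applies on your system.

```shell
# Sketch: print the UUID that a surviving server has on record for a hostname.
# Assumptions: one file per peer under the given directory, containing
# "uuid=<uuid>" and "hostname1=<name>" lines.
peer_uuid_for_host() {
    peers_dir=$1
    wanted_host=$2
    for f in "$peers_dir"/*; do
        [ -f "$f" ] || continue
        # The hostname is used as a regular expression here, so a "." in it
        # matches any character; good enough for a sketch.
        if grep -qx "hostname[0-9]*=$wanted_host" "$f"; then
            sed -n 's/^uuid=//p' "$f"
            return 0
        fi
    done
    return 1
}

# Usage on a healthy server (path is an assumption, see above):
#   peer_uuid_for_host /var/lib/glusterd/peers glusterfs-server-002
```

The printed value is what would go back into the reinstalled server's `glusterd.info` before restarting the GlusterFS service.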