Re: [Gluster-users] cannot add server back to cluster after reinstallation

2019-03-27 Thread Riccardo Murri
Thanks all for the help!  The cluster has been up for a few hours now
with no reported errors, so I guess the server replacement ultimately
went fine ;-)

Ciao,
R
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] cannot add server back to cluster after reinstallation

2019-03-27 Thread Rafi Kavungal Chundattu Parambil


- Original Message -
From: "Atin Mukherjee"
To: "Rafi Kavungal Chundattu Parambil", "Riccardo Murri"
Cc: gluster-users@gluster.org
Sent: Wednesday, March 27, 2019 4:07:42 PM
Subject: Re: [Gluster-users] cannot add server back to cluster after reinstallation

On Wed, 27 Mar 2019 at 16:02, Riccardo Murri 
wrote:

> Hello Atin,
>
> > Check cluster.op-version, peer status, volume status output. If they are
> all fine you’re good.
>
> Both `op-version` and `peer status` look fine:
> ```
> # gluster volume get all cluster.max-op-version
> Option  Value
> --  -
> cluster.max-op-version  31202
>
> # gluster peer status
> Number of Peers: 4
>
> Hostname: glusterfs-server-004
> Uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
> State: Peer in Cluster (Connected)
>
> Hostname: glusterfs-server-005
> Uuid: d53398f6-19d4-4633-8bc3-e493dac41789
> State: Peer in Cluster (Connected)
>
> Hostname: glusterfs-server-003
> Uuid: 3c74d2b4-a4f3-42d4-9511-f6174b0a641d
> State: Peer in Cluster (Connected)
>
> Hostname: glusterfs-server-001
> Uuid: 60bcc47e-ccbe-493e-b4ea-d45d63123977
> State: Peer in Cluster (Connected)
> ```
>
> However, `volume status` shows a missing snapshotd on the reinstalled
> server (the 002 one).


I believe you ran this command on 002? In that case it's shown as
localhost.


> We're not using snapshots so I guess this is fine too?


Is features.uss enabled for this volume? Otherwise we don’t show snapd
information in status output.

Rafi - am I correct?

Yes. We don't show snapd information unless uss is enabled, so please check
whether uss is enabled.

You can check with `gluster v get glusterfs features.uss`. If you are not
using any snapshots, there is no reason to keep uss enabled; you can disable
it with `gluster v set glusterfs features.uss disable`.


Please note that making configuration changes during a rolling upgrade is not
recommended. If you are in the middle of one, disable uss after the upgrade
completes.
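
The check-then-disable step described above could be wrapped roughly as follows. This is only a sketch: the function names `uss_value` and `disable_uss_if_enabled` are made up for illustration, and it assumes that `gluster volume get` prints the option name and its value as two whitespace-separated columns, as in the `cluster.max-op-version` output quoted in this thread.

```shell
# Hypothetical helper: print the current value of features.uss for volume $1.
# Assumes two-column "Option  Value" output from `gluster volume get`.
uss_value() {
    gluster volume get "$1" features.uss |
        awk '$1 == "features.uss" { print $2 }'
}

# Hypothetical helper: disable uss on volume $1 only if it is currently on.
# Do not run this in the middle of a rolling upgrade.
disable_uss_if_enabled() {
    vol=$1
    value=$(uss_value "$vol")
    case "$value" in
        on|enable|yes|true|1)
            gluster volume set "$vol" features.uss disable ;;
        *)
            echo "features.uss is '${value}' on ${vol}; nothing to do" ;;
    esac
}
```

With the volume from this thread, the call would be `disable_uss_if_enabled glusterfs`.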


Rafi KC

>
> ```
> # gluster volume status
> Status of volume: glusterfs
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick glusterfs-server-005:/srv/glusterfs   49152     0          Y       1410
> Brick glusterfs-server-004:/srv/glusterfs   49152     0          Y       1416
> Brick glusterfs-server-003:/srv/glusterfs   49152     0          Y       1520
> Brick glusterfs-server-001:/srv/glusterfs   49152     0          Y       1266
> Brick glusterfs-server-002:/srv/glusterfs   49152     0          Y       3011
> Snapshot Daemon on localhost                N/A       N/A        Y       3029
> Snapshot Daemon on glusterfs-server-001     49153     0          Y       1361
> Snapshot Daemon on glusterfs-server-005     49153     0          Y       1478
> Snapshot Daemon on glusterfs-server-004     49153     0          Y       1490
> Snapshot Daemon on glusterfs-server-003     49153     0          Y       1563
>
> Task Status of Volume glusterfs
> ------------------------------------------------------------------------------
> Task                 : Rebalance
> ID                   : 0eaf6ad1-df95-48f4-b941-17488010ddcc
> Status               : failed
> ```
>
> Thanks,
> Riccardo
>
-- 
--Atin

Re: [Gluster-users] cannot add server back to cluster after reinstallation

2019-03-27 Thread Atin Mukherjee
On Wed, 27 Mar 2019 at 16:02, Riccardo Murri 
wrote:

> Hello Atin,
>
> > Check cluster.op-version, peer status, volume status output. If they are
> all fine you’re good.
>
> Both `op-version` and `peer status` look fine:
> ```
> # gluster volume get all cluster.max-op-version
> Option  Value
> --  -
> cluster.max-op-version  31202
>
> # gluster peer status
> Number of Peers: 4
>
> Hostname: glusterfs-server-004
> Uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
> State: Peer in Cluster (Connected)
>
> Hostname: glusterfs-server-005
> Uuid: d53398f6-19d4-4633-8bc3-e493dac41789
> State: Peer in Cluster (Connected)
>
> Hostname: glusterfs-server-003
> Uuid: 3c74d2b4-a4f3-42d4-9511-f6174b0a641d
> State: Peer in Cluster (Connected)
>
> Hostname: glusterfs-server-001
> Uuid: 60bcc47e-ccbe-493e-b4ea-d45d63123977
> State: Peer in Cluster (Connected)
> ```
>
> However, `volume status` shows a missing snapshotd on the reinstalled
> server (the 002 one).


I believe you ran this command on 002? In that case it's shown as
localhost.


> We're not using snapshots so I guess this is fine too?


Is features.uss enabled for this volume? Otherwise we don’t show snapd
information in status output.

Rafi - am I correct?


>
> ```
> # gluster volume status
> Status of volume: glusterfs
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick glusterfs-server-005:/srv/glusterfs   49152     0          Y       1410
> Brick glusterfs-server-004:/srv/glusterfs   49152     0          Y       1416
> Brick glusterfs-server-003:/srv/glusterfs   49152     0          Y       1520
> Brick glusterfs-server-001:/srv/glusterfs   49152     0          Y       1266
> Brick glusterfs-server-002:/srv/glusterfs   49152     0          Y       3011
> Snapshot Daemon on localhost                N/A       N/A        Y       3029
> Snapshot Daemon on glusterfs-server-001     49153     0          Y       1361
> Snapshot Daemon on glusterfs-server-005     49153     0          Y       1478
> Snapshot Daemon on glusterfs-server-004     49153     0          Y       1490
> Snapshot Daemon on glusterfs-server-003     49153     0          Y       1563
>
> Task Status of Volume glusterfs
> ------------------------------------------------------------------------------
> Task                 : Rebalance
> ID                   : 0eaf6ad1-df95-48f4-b941-17488010ddcc
> Status               : failed
> ```
>
> Thanks,
> Riccardo
>
-- 
--Atin

Re: [Gluster-users] cannot add server back to cluster after reinstallation

2019-03-27 Thread Riccardo Murri
Hello Atin,

> Check cluster.op-version, peer status, volume status output. If they are all 
> fine you’re good.

Both `op-version` and `peer status` look fine:
```
# gluster volume get all cluster.max-op-version
Option  Value
--  -
cluster.max-op-version  31202

# gluster peer status
Number of Peers: 4

Hostname: glusterfs-server-004
Uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
State: Peer in Cluster (Connected)

Hostname: glusterfs-server-005
Uuid: d53398f6-19d4-4633-8bc3-e493dac41789
State: Peer in Cluster (Connected)

Hostname: glusterfs-server-003
Uuid: 3c74d2b4-a4f3-42d4-9511-f6174b0a641d
State: Peer in Cluster (Connected)

Hostname: glusterfs-server-001
Uuid: 60bcc47e-ccbe-493e-b4ea-d45d63123977
State: Peer in Cluster (Connected)
```

However, `volume status` shows a missing snapshotd on the reinstalled
server (the 002 one).
We're not using snapshots so I guess this is fine too?

```
# gluster volume status
Status of volume: glusterfs
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick glusterfs-server-005:/srv/glusterfs   49152     0          Y       1410
Brick glusterfs-server-004:/srv/glusterfs   49152     0          Y       1416
Brick glusterfs-server-003:/srv/glusterfs   49152     0          Y       1520
Brick glusterfs-server-001:/srv/glusterfs   49152     0          Y       1266
Brick glusterfs-server-002:/srv/glusterfs   49152     0          Y       3011
Snapshot Daemon on localhost                N/A       N/A        Y       3029
Snapshot Daemon on glusterfs-server-001     49153     0          Y       1361
Snapshot Daemon on glusterfs-server-005     49153     0          Y       1478
Snapshot Daemon on glusterfs-server-004     49153     0          Y       1490
Snapshot Daemon on glusterfs-server-003     49153     0          Y       1563

Task Status of Volume glusterfs
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : 0eaf6ad1-df95-48f4-b941-17488010ddcc
Status               : failed
```
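
To spot offline bricks or daemons in a longer `volume status` listing, a small filter along these lines may help. It is a rough sketch, not a gluster feature: `offline_processes` is a made-up name, and it assumes the last two columns of each process row are `Online` (Y/N) and `Pid`, as in the output above.

```shell
# Hypothetical filter: reads `gluster volume status` output on stdin and
# prints every row whose Online column is "N". Exits non-zero if any is found.
# Limitation: when gluster wraps a long brick name onto two lines, only the
# second line (the one carrying the columns) is printed.
offline_processes() {
    awk '
        BEGIN { bad = 0 }
        NF >= 4 && $(NF-1) == "N" && ($NF ~ /^[0-9]+$/ || $NF == "N/A") {
            print
            bad = 1
        }
        END { exit bad }
    '
}
```

Usage would be `gluster volume status | offline_processes`; no output and a zero exit status mean every listed process reports `Y`. The task status section (for example a failed rebalance) is not covered by this filter.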

Thanks,
Riccardo

Re: [Gluster-users] cannot add server back to cluster after reinstallation

2019-03-27 Thread Atin Mukherjee
On Wed, 27 Mar 2019 at 15:24, Riccardo Murri 
wrote:

> I managed to put the reinstalled server back into connected state with
> this procedure:
>
> 1. Run `for other_server in ...; do gluster peer probe $other_server;
> done` on the reinstalled server
> 2. Now all the peers on the reinstalled server show up as "Accepted
> Peer Request", which I fixed with the procedure outlined in the last
> paragraph of
> https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-glusterd/#debugging-glusterd
>
> Can anyone confirm that this is a good way to proceed and I won't be
> heading quickly towards corrupting volume data?


Check cluster.op-version, peer status, volume status output. If they are
all fine you’re good.
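
For the peer status part of that check, something like the following could flag peers that are not fully connected. It is a sketch under assumptions: `unhealthy_peers` is a made-up name, and it relies on the `Hostname:` / `State:` line layout that `gluster peer status` prints in this thread.

```shell
# Hypothetical filter: reads `gluster peer status` output on stdin and prints
# the hostname of each peer whose state is not "Peer in Cluster (Connected)".
# Exits non-zero if at least one such peer is found.
unhealthy_peers() {
    awk '
        BEGIN { bad = 0 }
        /^Hostname:/ { host = $2 }
        /^State:/ {
            sub(/^State:[ \t]*/, "")
            sub(/[ \t\r]+$/, "")
            if ($0 != "Peer in Cluster (Connected)") {
                print host
                bad = 1
            }
        }
        END { exit bad }
    '
}
```

Run it as `gluster peer status | unhealthy_peers` on each server, since every node keeps its own view of its peers.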


>
> Thanks,
> Riccardo
>
-- 
--Atin

Re: [Gluster-users] cannot add server back to cluster after reinstallation

2019-03-27 Thread Karthik Subrahmanya
+Sanju Rakonde & +Atin Mukherjee: adding glusterd folks who can help here.

On Wed, Mar 27, 2019 at 3:24 PM Riccardo Murri 
wrote:

> I managed to put the reinstalled server back into connected state with
> this procedure:
>
> 1. Run `for other_server in ...; do gluster peer probe $other_server;
> done` on the reinstalled server
> 2. Now all the peers on the reinstalled server show up as "Accepted
> Peer Request", which I fixed with the procedure outlined in the last
> paragraph of
> https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-glusterd/#debugging-glusterd
>
> Can anyone confirm that this is a good way to proceed and I won't be
> heading quickly towards corrupting volume data?
>
> Thanks,
> Riccardo
>

Re: [Gluster-users] cannot add server back to cluster after reinstallation

2019-03-27 Thread Riccardo Murri
I managed to put the reinstalled server back into connected state with
this procedure:

1. Run `for other_server in ...; do gluster peer probe $other_server;
done` on the reinstalled server
2. Now all the peers on the reinstalled server show up as "Accepted
Peer Request", which I fixed with the procedure outlined in the last
paragraph of 
https://docs.gluster.org/en/v3/Troubleshooting/troubleshooting-glusterd/#debugging-glusterd
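
Step 1 above, written out as a reusable function, might look like this. It is a sketch only: `probe_peers` is a made-up name, and the server list is passed in by the caller because the original loop elides it.

```shell
# Hypothetical wrapper around the probe loop from step 1.
# Arguments: hostnames of the other servers in the cluster.
probe_peers() {
    for other_server in "$@"; do
        gluster peer probe "$other_server" ||
            echo "probe of ${other_server} failed" >&2
    done
}
```

It would be run on the reinstalled server, for example `probe_peers glusterfs-server-001 glusterfs-server-003`, with the remaining hostnames appended.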

Can anyone confirm that this is a good way to proceed and I won't be
heading quickly towards corrupting volume data?

Thanks,
Riccardo


[Gluster-users] cannot add server back to cluster after reinstallation

2019-03-27 Thread Riccardo Murri
Hello,

A couple of days ago, the OS disk of one of the servers in a local
GlusterFS cluster suffered a bad crash, and I had to reinstall everything
from scratch.

However, when I restart the GlusterFS service on the server that has
been reinstalled, I see that it sends back a "RJT" response to other
servers of the cluster, which then list it as "State: Peer Rejected
(Connected)"; the reinstalled server instead shows "Number of peers: 0".
The DEBUG level log on the reinstalled machine shows these lines after
the peer probe from another server in the cluster:

I [MSGID: 106490] [glusterd-handler.c:2540:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
D [MSGID: 0] [glusterd-peer-utils.c:208:glusterd_peerinfo_find_by_uuid] 0-management: Friend with uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318, not found
D [MSGID: 0] [glusterd-peer-utils.c:234:glusterd_peerinfo_find] 0-management: Unable to find peer by uuid: 9a5763d2-1941-4e5d-8d33-8d6756f7f318
D [MSGID: 0] [glusterd-peer-utils.c:132:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: glusterfs-server-004
D [MSGID: 0] [glusterd-peer-utils.c:246:glusterd_peerinfo_find] 0-management: Unable to find hostname: glusterfs-server-004
I [MSGID: 106493] [glusterd-handler.c:3800:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to glusterfs-server-004 (24007), ret: 0, op_ret: -1

What can I do to re-add the reinstalled server into the cluster?  Is it
safe (= keeps data) to "peer detach" it and then "peer probe" again?

Additional info:

* The actual GlusterFS brick data was on a different disk, so it is safe
  and has been mounted back at its original location.

* I copied back the `/etc/glusterfs/glusterd.vol` from the other servers
  in the cluster and restored the UUID into
  `/var/lib/glusterfs/glusterd.info`

* I have checked that `cluster.max-op-version` is the same on all servers
  of the cluster, including the reinstalled one.

* All servers run Ubuntu 16.04
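
The op-version comparison mentioned in the list above could be scripted roughly like this. It is a sketch with made-up function names (`max_op_version`, `all_same`), assuming the two-column output of `gluster volume get all cluster.max-op-version`; how the values are collected from each server (for instance over SSH) is left to the reader.

```shell
# Hypothetical helper: print the local cluster.max-op-version value.
max_op_version() {
    gluster volume get all cluster.max-op-version |
        awk '$1 == "cluster.max-op-version" { print $2 }'
}

# Hypothetical helper: reads one value per line on stdin and succeeds only
# if all lines are identical (and at least one line was given).
all_same() {
    sort -u | awk 'END { exit (NR == 1 ? 0 : 1) }'
}
```

After collecting the output of `max_op_version` from every server into a file, `all_same < versions.txt` (file name chosen here for illustration) reports whether they agree.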

Thanks for any suggestion!

Riccardo