Hi David,

It looks like a network issue to me, since the node is unable to connect to 
the other node and is timing out.

A few things you can check:

  *   Check the /etc/hosts file on both the servers and make sure it has the 
correct IP of the other node.
  *   Are you binding gluster to a specific IP that may have changed after 
your update?
  *   Check if you can access port 24007 from the other host.
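
The first and third checks can be scripted if that is easier. Below is a 
minimal Python sketch, assuming the peers are reachable under the hostnames 
"sg" and "br" used in this thread. The function names `resolve` and 
`port_open` are my own and not part of Gluster.

```python
import socket


def resolve(host):
    """Return the IPv4 address the local resolver gives for host, or None.

    On a typical Linux setup the resolver consults /etc/hosts first,
    so a stale entry there would show up in the returned address.
    """
    try:
        return socket.gethostbyname(host)
    except socket.gaierror:
        return None


def port_open(host, port=24007, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

On sg you would call `resolve("br")` and `port_open("br")`, and the reverse 
on br. A wrong address from `resolve`, or `False` from `port_open`, would 
point to the hosts file or a firewall rather than to Gluster itself.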

If all of the above checks pass, you can try stopping glusterd on both nodes 
(sg first, then br) and making sure no gluster-related processes are left. 
Then start gluster on br first, check `gluster peer status`, and then start 
gluster on sg. (If you can take the downtime 🙂)
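
To confirm that nothing gluster-related is left before restarting, you could 
filter the `ps -ef` output. A rough Python sketch follows. `leftover_gluster` 
and `current_leftovers` are hypothetical helper names, and the parsing assumes 
the usual `ps -ef` column layout, with the command in the eighth field.

```python
import subprocess


def leftover_gluster(ps_output):
    """Return (pid, command) for every process whose command line mentions gluster."""
    found = []
    for line in ps_output.splitlines():
        # Columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.strip().split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit():
            continue  # header row or a line that does not look like ps output
        pid, cmd = int(fields[1]), fields[7]
        # Skip the grep itself when the text came from "ps -ef | grep gluster".
        if "gluster" in cmd and not cmd.startswith("grep "):
            found.append((pid, cmd))
    return found


def current_leftovers():
    """Run ps -ef on this host and return any gluster processes still alive."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return leftover_gluster(result.stdout)
```

An empty list from `current_leftovers()` on a node means it is clean. Keep in 
mind that the brick (glusterfsd) and FUSE mount (glusterfs) processes are 
separate from glusterd, so they may still be listed after glusterd itself has 
stopped.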

Thanks,
Anant
________________________________
From: Gluster-users <[email protected]> on behalf of David 
Cunningham <[email protected]>
Sent: 23 February 2023 9:56 PM
To: gluster-users <[email protected]>
Subject: Re: [Gluster-users] Big problems after update to 9.6


Is it possible that version 9.1 and 9.6 can't talk to each other? My 
understanding was that they should be able to.


On Fri, 24 Feb 2023 at 10:36, David Cunningham 
<[email protected]> wrote:
We've tried to remove "sg" from the cluster so we can re-install the GlusterFS 
node on it, but the following command run on "br" also gives a timeout error:

gluster volume remove-brick gvol0 replica 1 sg:/nodirectwritedata/gluster/gvol0 force

How can we tell "br" to just remove "sg" without trying to contact it?


On Fri, 24 Feb 2023 at 10:31, David Cunningham 
<[email protected]> wrote:
Hello,

We have a cluster with two nodes, "sg" and "br", which were running GlusterFS 
9.1, installed via the Ubuntu package manager. We updated the Ubuntu packages 
on "sg" to version 9.6, and now have big problems. The "br" node is still on 
version 9.1.

Running "gluster volume status" on either host gives "Error : Request timed 
out". On "sg" not all processes are running, compared to "br", as below. 
Restarting the services on "sg" doesn't help. Can anyone advise how we should 
proceed? This is a production system.

root@sg:~# ps -ef | grep gluster
root     15196     1  0 22:37 ?        00:00:00 /usr/sbin/glusterd -p 
/var/run/glusterd.pid --log-level INFO
root     15426     1  0 22:39 ?        00:00:00 /usr/bin/python3 
/usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root     15457 15426  0 22:39 ?        00:00:00 /usr/bin/python3 
/usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root     19341 13695  0 23:24 pts/1    00:00:00 grep --color=auto gluster

root@br:~# ps -ef | grep gluster
root      2052     1  0  2022 ?        00:00:00 /usr/bin/python3 
/usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root      2062     1  3  2022 ?        10-11:57:16 /usr/sbin/glusterfs 
--fuse-mountopts=noatime --process-name fuse --volfile-server=br 
--volfile-server=sg --volfile-id=/gvol0 --fuse-mountopts=noatime /mnt/glusterfs
root      2379  2052  0  2022 ?        00:00:00 /usr/bin/python3 
/usr/sbin/glustereventsd --pid-file /var/run/glustereventsd.pid
root      5884     1  5  2022 ?        18-16:08:53 /usr/sbin/glusterfsd -s br 
--volfile-id gvol0.br.nodirectwritedata-gluster-gvol0 -p 
/var/run/gluster/vols/gvol0/br-nodirectwritedata-gluster-gvol0.pid -S 
/var/run/gluster/61df1d4e1c65300e.socket --brick-name 
/nodirectwritedata/gluster/gvol0 -l 
/var/log/glusterfs/bricks/nodirectwritedata-gluster-gvol0.log --xlator-option 
*-posix.glusterd-uuid=11e528b0-8c69-4b5d-82ed-c41dd25536d6 --process-name brick 
--brick-port 49152 --xlator-option gvol0-server.listen-port=49152
root     10463 18747  0 23:24 pts/1    00:00:00 grep --color=auto gluster
root     27744     1  0  2022 ?        03:55:10 /usr/sbin/glusterfsd -s br 
--volfile-id gvol0.br.nodirectwritedata-gluster-gvol0 -p 
/var/run/gluster/vols/gvol0/br-nodirectwritedata-gluster-gvol0.pid -S 
/var/run/gluster/61df1d4e1c65300e.socket --brick-name 
/nodirectwritedata/gluster/gvol0 -l 
/var/log/glusterfs/bricks/nodirectwritedata-gluster-gvol0.log --xlator-option 
*-posix.glusterd-uuid=11e528b0-8c69-4b5d-82ed-c41dd25536d6 --process-name brick 
--brick-port 49153 --xlator-option gvol0-server.listen-port=49153
root     48227     1  0 Feb17 ?        00:00:26 /usr/sbin/glusterd -p 
/var/run/glusterd.pid --log-level INFO

On "sg" in glusterd.log we're seeing:

[2023-02-23 20:26:57.619318 +0000] E [rpc-clnt.c:181:call_bail] 0-management: 
bailing out frame type(glusterd mgmt v3), op(--(6)), xid = 0x11, unique = 27, 
sent = 2023-02-23 20:16:50.596447 +0000, timeout = 600 for 
10.20.20.11:24007
[2023-02-23 20:26:57.619425 +0000] E [MSGID: 106115] 
[glusterd-mgmt.c:122:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed 
on br. Please check log file for details.
[2023-02-23 20:26:57.619545 +0000] E [MSGID: 106151] 
[glusterd-syncop.c:1655:gd_unlock_op_phase] 0-management: Failed to unlock on 
some peer(s)
[2023-02-23 20:26:57.619693 +0000] W 
[glusterd-locks.c:817:glusterd_mgmt_v3_unlock] 
(-->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe19b9) 
[0x7fadf47fa9b9] 
-->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe0e20) 
[0x7fadf47f9e20] 
-->/usr/lib/x86_64-linux-gnu/glusterfs/9.6/xlator/mgmt/glusterd.so(+0xe7904) 
[0x7fadf4800904] ) 0-management: Lock owner mismatch. Lock for vol gvol0 held 
by 11e528b0-8c69-4b5d-82ed-c41dd25536d6
[2023-02-23 20:26:57.619780 +0000] E [MSGID: 106117] 
[glusterd-syncop.c:1679:gd_unlock_op_phase] 0-management: Unable to release 
lock for gvol0
[2023-02-23 20:26:57.619939 +0000] I [socket.c:3811:socket_submit_outgoing_msg] 
0-socket.management: not connected (priv->connected = -1)
[2023-02-23 20:26:57.619969 +0000] E [rpcsvc.c:1567:rpcsvc_submit_generic] 
0-rpc-service: failed to submit message (XID: 0x3, Program: GlusterD svc cli, 
ProgVers: 2, Proc: 27) to rpc-transport (socket.management)
[2023-02-23 20:26:57.619995 +0000] E [MSGID: 106430] 
[glusterd-utils.c:678:glusterd_submit_reply] 0-glusterd: Reply submission failed

And in the brick log:

[2023-02-23 20:22:56.717721 +0000] I [addr.c:54:compare_addr_and_update] 
0-/nodirectwritedata/gluster/gvol0: allowed = "*", received addr = "10.20.20.11"
[2023-02-23 20:22:56.717817 +0000] I [login.c:110:gf_auth] 0-auth/login: 
allowed user names: a26c7de4-1236-4e0a-944a-cb82de7f7f0e
[2023-02-23 20:22:56.717840 +0000] I [MSGID: 115029] 
[server-handshake.c:561:server_setvolume] 0-gvol0-server: accepted client from 
CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0
 (version: 9.1) with subvol /nodirectwritedata/gluster/gvol0
[2023-02-23 20:22:56.741545 +0000] W [socket.c:766:__socket_rwv] 
0-tcp.gvol0-server: readv on 
10.20.20.11:49144
 failed (No data available)
[2023-02-23 20:22:56.741599 +0000] I [MSGID: 115036] 
[server.c:500:server_rpc_notify] 0-gvol0-server: disconnecting connection 
[{client-uid=CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0}]
[2023-02-23 20:22:56.741866 +0000] I [MSGID: 101055] 
[client_t.c:397:gf_client_unref] 0-gvol0-server: Shutting down connection 
CTX_ID:46b23c19-5114-4a20-9306-9ea6faf02d51-GRAPH_ID:0-PID:35568-HOST:br.m5voip.com-PC_NAME:gvol0-client-0-RECON_NO:-0

Thanks for your help,

--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782

