Hi Aravinda,


I have checked the rsync version, and it is the same on the primary and
secondary nodes: rsync version 3.1.3, protocol version 31, on all servers. It
is very strange. As far as we are aware, we have not made any changes, and this
geo-replication had been working fine for the last 5 years. It has now stopped
suddenly, and we are unable to find the root cause.
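
For anyone who wants to repeat the version comparison across several servers, 
here is a minimal Python sketch. It only parses the first line that 
`rsync --version` prints; the host names "primary" and "secondary" are 
placeholders, and how the banner text is collected from each node is left out.

```python
import re


def parse_rsync_version(banner: str):
    """Return (version, protocol) from the first line of `rsync --version`."""
    match = re.search(
        r"rsync\s+version\s+(\S+)\s+protocol version\s+(\d+)", banner
    )
    if match is None:
        raise ValueError("unrecognised rsync banner")
    return match.group(1), int(match.group(2))


def versions_match(banners: dict) -> bool:
    """True when every host reports the same (version, protocol) pair."""
    return len({parse_rsync_version(text) for text in banners.values()}) == 1


# Placeholder host names; the banner text is what rsync 3.1.3 prints.
banners = {
    "primary": "rsync  version 3.1.3  protocol version 31",
    "secondary": "rsync  version 3.1.3  protocol version 31",
}
print(versions_match(banners))
```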


I have checked with tcpdump, and I can see that the master node sends an RST to
the secondary node when geo-replication connects. We do not see any RST when we
ssh from the master to the secondary node as the root user ourselves. This
makes me think that geo-replication is able to connect to the secondary node,
but then something goes wrong and it resets the connection, and this repeats
in a loop.
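
To see which side sends the resets, the text output of tcpdump can be 
summarised with a small script. This is only a sketch: it assumes the usual 
`IP src.port > dst.port: Flags [...]` line layout that tcpdump prints with 
`-n`, and the addresses in the sample are documentation addresses, not our 
real nodes.

```python
import re
from collections import Counter

# Matches the "IP src.port > dst.port: Flags [..]" part of a tcpdump line.
LINE_RE = re.compile(r"IP (\S+) > (\S+): Flags \[([^\]]*)\]")


def count_resets(tcpdump_text: str) -> Counter:
    """Count packets carrying the RST flag, keyed by the sending host."""
    resets = Counter()
    for line in tcpdump_text.splitlines():
        match = LINE_RE.search(line)
        if match and "R" in match.group(3):
            sender = match.group(1).rsplit(".", 1)[0]  # drop the port
            resets[sender] += 1
    return resets


# Made-up capture: 192.0.2.10 stands in for the master node,
# 192.0.2.20 for the secondary node.
sample = """\
22:37:36.100000 IP 192.0.2.10.51234 > 192.0.2.20.22: Flags [S], seq 1, win 64240, length 0
22:37:36.100400 IP 192.0.2.20.22 > 192.0.2.10.51234: Flags [S.], seq 9, ack 2, win 65160, length 0
22:37:36.900000 IP 192.0.2.10.51234 > 192.0.2.20.22: Flags [R.], seq 50, ack 60, win 0, length 0
"""
print(count_resets(sample))
```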


I have also enabled geo-replication debug logs, and I am getting the following
error in the gsyncd logs on the master node.


[2024-02-07 22:37:36.820978] D [repce(worker 
/opt/tier1data2019/brick):195:push] RepceClient: call 
2563661:140414778891136:1707345456.8209238 entry_ops([{'op': 'CREATE', 
'skip_entry': False, 'gfid': '3d57e1e4-7bd2-44f6-a6d1-d628208b3697', 'entry': 
'.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8795785720233840105.docx',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 
'gfid': '3d57e1e4-7bd2-44f6-a6d1-d628208b3697', 'entry': 
'.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8795785720233840105.docx'},
 {'op': 'CREATE', 'skip_entry': False, 'gfid': 
'7bd35f91-1408-476d-869a-9936f2d94afc', 'entry': 
'.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/0c3fb22f-0fbe-4445-845b-9d94d84a9888',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 
'gfid': '3837018c-2f5e-43d4-ab58-0ed8b7456e73', 'entry': 
'.gfid/861afb81-386a-4b5b-af37-cef63a55a436/26fcd7e7-2c8c-4dcb-96f2-2c8a0d79f3d4',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 
'gfid': 'db311b10-b1e2-4b84-adea-a6746214aeda', 'entry': 
'.gfid/861afb81-386a-4b5b-af37-cef63a55a436/0526d0da-1f36-4203-8563-7e23aacf6237',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 
'gfid': '9bbb253a-226a-44b1-a968-7cfa76cf9463', 'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeLLRenewalLetterDocusign_1_22_15_1_18_153.doc',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 
'gfid': '9bbb253a-226a-44b1-a968-7cfa76cf9463', 'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeLLRenewalLetterDocusign_1_22_15_1_18_153.doc'},
 {'op': 'CREATE', 'skip_entry': False, 'gfid': 
'f62d0c65-6ede-48ff-b9bf-c44a33e5e023', 'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/85530794-c15f-44d4-8660-87a14c2c9c8c',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 
'gfid': 'fd3d0af6-8ef5-4b76-bb47-0bc508df0ed0', 'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeMOA_1_22_15_1_20_501.doc',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 
'gfid': 'fd3d0af6-8ef5-4b76-bb47-0bc508df0ed0', 'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeMOA_1_22_15_1_20_501.doc'},
 {'op': 'CREATE', 'skip_entry': False, 'gfid': 
'e93c5771-9676-40d4-90cd-f0586ec05dd9', 'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/cc372667-3b77-468f-bac6-671d4eb069e9',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 
'gfid': '02045f44-68ff-4a35-a843-08939afc46a4', 'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeTTRenewalLetterASTNoFee-2022_1_22_15_1_19_530.doc',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 
'gfid': '02045f44-68ff-4a35-a843-08939afc46a4', 'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/app_docmergeTTRenewalLetterASTNoFee-2022_1_22_15_1_19_530.doc'},
 {'op': 'CREATE', 'skip_entry': False, 'gfid': 
'6f5766c9-2dc3-4636-9041-9cf4ac64d26b', 'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/556a0e3c-510d-4396-8f32-335aafec1314',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 
'gfid': 'f78561f0-c9f2-4192-a82a-8368e0ad8b2b', 'entry': 
'.gfid/ec161c2e-bb32-4639-a7b2-9be961221d86/app_1705935977525.tmp'}, {'op': 
'CREATE', 'skip_entry': False, 'gfid': 'd1e33edb-523e-41c1-a021-8bd3a5a2c7c0', 
'entry': 
'.gfid/e861ff10-696a-4b03-9716-39d9e7dd08d7/c655e3e5-9d4c-43d7-9171-949f01612e6d',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 
'gfid': 'b6f44b28-c2bf-4e70-b953-1c559ded7835', 'entry': 
'.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge7370453767656401681.docx',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 
'gfid': 'b6f44b28-c2bf-4e70-b953-1c559ded7835', 'entry': 
'.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge7370453767656401681.docx'},
 {'op': 'CREATE', 'skip_entry': False, 'gfid': 
'2d845d9e-7a49-4200-a100-759fe831ba0e', 'entry': 
'.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/84d47d84-5749-4a19-8f73-293078d17c63',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 
'gfid': '44554c17-21aa-427a-b796-7ecec6af2570', 'entry': 
'.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8634804987715893755.docx',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'CREATE', 'skip_entry': False, 
'gfid': '652bf5d7-3b7a-41d8-aa4f-e52296034821', 'entry': 
'.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/91a25682-69ea-4edc-9250-d6c7aac56853',
 'mode': 33188, 'uid': 0, 'gid': 0}, {'op': 'UNLINK', 'skip_entry': False, 
'gfid': '44554c17-21aa-427a-b796-7ecec6af2570', 'entry': 
'.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/app_docmerge8634804987715893755.docx'},
 {'op': 'CREATE', 'skip_entry': False, 'gfid': 
'04720811-b90e-42b7-a5d1-656afd92e245', 'entry': 
'.gfid/9a39167c-6c28-470a-b699-11eeaaff8edd/a66cbc42-61dc-4896-bb69-c715f1a820db',
 'mode': 33188, 'uid': 0, 'gid': 0}],) ...

[2024-02-07 22:37:36.909606] D [repce(worker 
/opt/tier1data2019/brick):215:__call__] RepceClient: call 
2563661:140414778891136:1707345456.8209238 entry_ops -> []
[2024-02-07 22:37:36.911032] D [master(worker 
/opt/tier1data2019/brick):317:a_syncdata] _GMaster: files 
[{files={'.gfid/652bf5d7-3b7a-41d8-aa4f-e52296034821', 
'.gfid/2d845d9e-7a49-4200-a100-759fe831ba0e', 
'.gfid/3837018c-2f5e-43d4-ab58-0ed8b7456e73', 
'.gfid/e93c5771-9676-40d4-90cd-f0586ec05dd9', 
'.gfid/f62d0c65-6ede-48ff-b9bf-c44a33e5e023', 
'.gfid/7bd35f91-1408-476d-869a-9936f2d94afc', 
'.gfid/04720811-b90e-42b7-a5d1-656afd92e245', 
'.gfid/6f5766c9-2dc3-4636-9041-9cf4ac64d26b', 
'.gfid/db311b10-b1e2-4b84-adea-a6746214aeda', 
'.gfid/d1e33edb-523e-41c1-a021-8bd3a5a2c7c0'}}]
[2024-02-07 22:37:36.911089] D [master(worker 
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing 
[{file=.gfid/652bf5d7-3b7a-41d8-aa4f-e52296034821}]
[2024-02-07 22:37:36.911133] D [master(worker 
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing 
[{file=.gfid/2d845d9e-7a49-4200-a100-759fe831ba0e}]
[2024-02-07 22:37:36.911169] D [master(worker 
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing 
[{file=.gfid/3837018c-2f5e-43d4-ab58-0ed8b7456e73}]
[2024-02-07 22:37:36.911202] D [master(worker 
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing 
[{file=.gfid/e93c5771-9676-40d4-90cd-f0586ec05dd9}]
[2024-02-07 22:37:36.911235] D [master(worker 
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing 
[{file=.gfid/f62d0c65-6ede-48ff-b9bf-c44a33e5e023}]
[2024-02-07 22:37:36.911268] D [master(worker 
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing 
[{file=.gfid/7bd35f91-1408-476d-869a-9936f2d94afc}]
[2024-02-07 22:37:36.911301] D [master(worker 
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing 
[{file=.gfid/04720811-b90e-42b7-a5d1-656afd92e245}]
[2024-02-07 22:37:36.911333] D [master(worker 
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing 
[{file=.gfid/6f5766c9-2dc3-4636-9041-9cf4ac64d26b}]
[2024-02-07 22:37:36.911366] D [master(worker 
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing 
[{file=.gfid/db311b10-b1e2-4b84-adea-a6746214aeda}]
[2024-02-07 22:37:36.911398] D [master(worker 
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing 
[{file=.gfid/d1e33edb-523e-41c1-a021-8bd3a5a2c7c0}]
[2024-02-07 22:37:36.911439] D [master(worker 
/opt/tier1data2019/brick):1344:process] _GMaster: processing change 
[{changelog=/var/lib/misc/gluster/gsyncd/tier1data_drtier1data_drtier1data/opt-tier1data2019-brick/.history/.processing/CHANGELOG.1705936007}]
[2024-02-07 22:37:36.915193] E [syncdutils(worker 
/opt/tier1data2019/brick):346:log_raise_exception] <top>: Gluster Mount process 
exited [{error=ENOTCONN}]
[2024-02-07 22:37:36.915252] E [syncdutils(worker 
/opt/tier1data2019/brick):363:log_raise_exception] <top>: FULL EXCEPTION TRACE:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 317, in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 86, in 
subcmd_worker
    local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1298, in 
service_loop
    g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 604, in 
crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1614, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1510, in 
changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1345, in 
process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1071, in 
process_change
    st = lstat(pt)
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 589, in 
lstat
    return errno_wrap(os.lstat, [e], [ENOENT], [ESTALE, EBUSY])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 571, in 
errno_wrap
    return call(*arg)
OSError: [Errno 107] Transport endpoint is not connected: 
'.gfid/d53fad8f-84e9-4b24-9eb0-ccbcbdc4baa8'
[2024-02-07 22:37:37.344426] I [monitor(monitor):228:monitor] Monitor: worker 
died in startup phase [{brick=/opt/tier1data2019/brick}]
[2024-02-07 22:37:37.346601] I [gsyncdstatus(monitor):248:set_worker_status] 
GeorepStatus: Worker Status Change [{status=Faulty}]
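
The debug output is long, so a small filter helps to pull out only the 
E-level entries. The sketch below assumes the 
`[timestamp] LEVEL [component] message` layout seen in the log above and 
rejoins lines that were wrapped by the mail client. Note that ENOTCONN and 
"Errno 107, Transport endpoint is not connected" in the traceback are the 
same error on Linux.

```python
import re

# Start of a gsyncd log entry: "[YYYY-MM-DD HH:MM:SS.ffffff] L "
ENTRY_START = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] ([A-Z]) "
)


def error_entries(log_text: str):
    """Return (timestamp, message) for each E-level entry.

    Lines that do not start a new entry are treated as continuations
    of the previous one and joined to it with a single space.
    """
    entries, current = [], None
    for line in log_text.splitlines():
        match = ENTRY_START.match(line)
        if match:
            current = [match.group(1), match.group(2), line[match.end():].rstrip()]
            entries.append(current)
        elif current is not None and line.strip():
            current[2] = current[2] + " " + line.strip()
    return [(ts, msg) for ts, level, msg in entries if level == "E"]


# Three entries copied from the log above, with the mail wrapping kept.
sample = """\
[2024-02-07 22:37:36.911398] D [master(worker
/opt/tier1data2019/brick):320:a_syncdata] _GMaster: candidate for syncing
[{file=.gfid/d1e33edb-523e-41c1-a021-8bd3a5a2c7c0}]
[2024-02-07 22:37:36.915193] E [syncdutils(worker
/opt/tier1data2019/brick):346:log_raise_exception] <top>: Gluster Mount process
exited [{error=ENOTCONN}]
[2024-02-07 22:37:37.344426] I [monitor(monitor):228:monitor] Monitor: worker
died in startup phase [{brick=/opt/tier1data2019/brick}]
"""
for timestamp, message in error_entries(sample):
    print(timestamp, message)
```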


Thanks,
Anant

________________________________
From: Aravinda <aravi...@kadalu.tech>
Sent: 07 February 2024 2:54 PM
To: Anant Saraswat <anant.saras...@techblue.co.uk>
Cc: Strahil Nikolov <hunter86...@yahoo.com>; gluster-users@gluster.org 
<gluster-users@gluster.org>
Subject: Re: [Gluster-users] __Geo-replication status is getting Faulty after 
few seconds


EXTERNAL: Do not click links or open attachments if you do not recognize the 
sender.

It will keep track of the last sync time if you change to a non-root user. But
I don't think the issue is related to root vs non-root user.

Even in non-root user based Geo-rep, the Primary volume is mounted using the
root user only. Only on the secondary node does it use the Glusterd mountbroker
to allow mounting the Secondary volume as a non-privileged user.

Check the rsync version on the Primary and secondary nodes. Please fix the
versions if they do not match.

--
Aravinda
Kadalu Technologies



---- On Wed, 07 Feb 2024 20:11:47 +0530 Anant Saraswat 
<anant.saras...@techblue.co.uk> wrote ---

No, it was set up and running using the root user only.

Do you think I should set it up using a dedicated non-root user? Will it keep
track of the old files, or will it treat this as a new geo-replication and copy
all the files from scratch?

________________________________
From: Strahil Nikolov <hunter86...@yahoo.com>
Sent: 07 February 2024 2:36 PM
To: Anant Saraswat <anant.saras...@techblue.co.uk>; Aravinda 
<aravi...@kadalu.tech>
Cc: gluster-users@gluster.org <gluster-users@gluster.org>
Subject: Re: [Gluster-users] __Geo-replication status is getting Faulty after 
few seconds


Have you tried setting up gluster georep with a dedicated non-root user ?

Best Regards,
Strahil Nikolov

On Tue, Feb 6, 2024 at 16:38, Anant Saraswat
<anant.saras...@techblue.co.uk> wrote:
________



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


DISCLAIMER: This email and any files transmitted with it are confidential and 
intended solely for the use of the individual or entity to whom they are 
addressed. If you have received this email in error, please notify the sender. 
This message contains confidential information and is intended only for the 
individual named. If you are not the named addressee, you should not 
disseminate, distribute or copy this email. Please notify the sender 
immediately by email if you have received this email by mistake and delete this 
email from your system.

If you are not the intended recipient, you are notified that disclosing, 
copying, distributing or taking any action in reliance on the contents of this 
information is strictly prohibited. Thanks for your cooperation.



