Hi Deepu,

Can you try this:

ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 sas@192.168.185.118 "sudo gluster volume status"

/sunny

On Thu, Nov 28, 2019 at 12:14 PM deepu srinivasan <sdeep...@gmail.com> wrote:
>>
>> MASTER NODE        MASTER VOL    MASTER BRICK                        SLAVE USER    SLAVE                             SLAVE NODE         STATUS     CRAWL STATUS    LAST_SYNCED
>> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>> 192.168.185.89     code-misc     /home/sas/gluster/data/code-misc    sas           sas@192.168.185.118::code-misc    N/A                Faulty     N/A             N/A
>> 192.168.185.101    code-misc     /home/sas/gluster/data/code-misc    sas           sas@192.168.185.118::code-misc    192.168.185.118    Passive    N/A             N/A
>> 192.168.185.93     code-misc     /home/sas/gluster/data/code-misc    sas           sas@192.168.185.118::code-misc    N/A                Faulty     N/A             N/A
>
> On Thu, Nov 28, 2019 at 5:43 PM deepu srinivasan <sdeep...@gmail.com> wrote:
>>
>> I think it's configured properly. Should I check something else?
>>
>> root@192.168.185.89 /var/log/glusterfs # ssh sas@192.168.185.118 "sudo gluster volume info"
>>
>> **************************************************************************************************************************
>>
>> WARNING: This system is a restricted access system. All activity on this
>> system is subject to monitoring. If information collected reveals possible
>> criminal activity or activity that exceeds privileges, evidence of such
>> activity may be providedto the relevant authorities for further action.
>>
>> By continuing past this point, you expressly consent to this monitoring.-
>>
>> **************************************************************************************************************************
>>
>> Volume Name: code-misc
>> Type: Replicate
>> Volume ID: e9b6fbed-fcd0-42a9-ab11-02ec39c2ee07
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 1 x 3 = 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: 192.168.185.118:/home/sas/gluster/data/code-misc
>> Brick2: 192.168.185.45:/home/sas/gluster/data/code-misc
>> Brick3: 192.168.185.84:/home/sas/gluster/data/code-misc
>> Options Reconfigured:
>> features.read-only: enable
>> transport.address-family: inet
>> nfs.disable: on
>> performance.client-io-threads: off
>>
>> On Thu, Nov 28, 2019 at 5:40 PM Sunny Kumar <sunku...@redhat.com> wrote:
>>>
>>> Hi Deepu,
>>>
>>> Looks like this error is generated due to ssh restrictions.
>>> Can you please check and confirm ssh is properly configured?
>>>
>>> [2019-11-28 11:59:12.934436] E [syncdutils(worker /home/sas/gluster/data/code-misc):809:logerr] Popen: ssh> **************************************************************************************************************************
>>> [2019-11-28 11:59:12.934703] E [syncdutils(worker /home/sas/gluster/data/code-misc):809:logerr] Popen: ssh> WARNING: This system is a restricted access system. All activity on this system is subject to monitoring. If information collected reveals possible criminal activity or activity that exceeds privileges, evidence of such activity may be providedto the relevant authorities for further action.
>>> [2019-11-28 11:59:12.934967] E [syncdutils(worker /home/sas/gluster/data/code-misc):809:logerr] Popen: ssh> By continuing past this point, you expressly consent to this monitoring.- ZOHO Corporation
>>> [2019-11-28 11:59:12.935194] E [syncdutils(worker /home/sas/gluster/data/code-misc):809:logerr] Popen: ssh> **************************************************************************************************************************
>>> [2019-11-28 11:59:12.944369] I [repce(agent /home/sas/gluster/data/code-misc):97:service_loop] RepceServer: terminating on reaching EOF.
>>>
>>> /sunny
>>>
>>> On Thu, Nov 28, 2019 at 12:03 PM deepu srinivasan <sdeep...@gmail.com> wrote:
>>> >
>>> > ---------- Forwarded message ---------
>>> > From: deepu srinivasan <sdeep...@gmail.com>
>>> > Date: Thu, Nov 28, 2019 at 5:32 PM
>>> > Subject: Geo-Replication Issue while upgrading
>>> > To: gluster-users <gluster-users@gluster.org>
>>> >
>>> > Hi Users/Developers,
>>> > I hope you remember the last issue we faced, where geo-replication went into the Faulty state while stopping and starting the session.
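The traceback in the quoted logs below ends in an OSError: [Errno 13] Permission denied on the slave brick's .glusterfs path, which for a non-root geo-rep session usually means the sync user cannot traverse the brick. A minimal sketch of a check for that (the brick path and the user "sas" are taken from this thread; run it on the slave node as the geo-rep user):

```shell
# Sketch: verify the geo-rep user can traverse the brick's .glusterfs tree.
# BRICK defaults to the path from the traceback in this thread; override as needed.
BRICK=${BRICK:-/home/sas/gluster/data/code-misc6}
if [ -r "$BRICK/.glusterfs" ] && [ -x "$BRICK/.glusterfs" ]; then
    msg="OK: $BRICK/.glusterfs is traversable by $(id -un)"
else
    msg="FAIL: $(id -un) cannot traverse $BRICK/.glusterfs (entry_ops would raise EACCES)"
fi
echo "$msg"
```

Run it once per brick; on the real setup you would execute it as the sync user (e.g. with sudo -u sas) on every slave node.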
>>> >> >>> >> [2019-11-16 17:29:43.536881] I [gsyncdstatus(worker >>> >> /home/sas/gluster/data/code-misc6):281:set_active] GeorepStatus: Worker >>> >> Status Change status=Active >>> >> [2019-11-16 17:29:43.629620] I [gsyncdstatus(worker >>> >> /home/sas/gluster/data/code-misc6):253:set_worker_crawl_status] >>> >> GeorepStatus: Crawl Status Change status=History Crawl >>> >> [2019-11-16 17:29:43.630328] I [master(worker >>> >> /home/sas/gluster/data/code-misc6):1517:crawl] _GMaster: starting >>> >> history crawl turns=1 stime=(1573924576, 0) entry_stime=(1573924576, >>> >> 0) etime=1573925383 >>> >> [2019-11-16 17:29:44.636725] I [master(worker >>> >> /home/sas/gluster/data/code-misc6):1546:crawl] _GMaster: slave's time >>> >> stime=(1573924576, 0) >>> >> [2019-11-16 17:29:44.778966] I [master(worker >>> >> /home/sas/gluster/data/code-misc6):898:fix_possible_entry_failures] >>> >> _GMaster: Fixing ENOENT error in slave. Parent does not exist on master. >>> >> Safe to ignore, take out entry retry_count=1 entry=({'uid': 0, >>> >> 'gfid': 'c02519e0-0ead-4fe8-902b-dcae72ef83a3', 'gid': 0, 'mode': 33188, >>> >> 'entry': '.gfid/d60aa0d5-4fdf-4721-97dc-9e3e50995dab/368307802', 'op': >>> >> 'CREATE'}, 2, {'slave_isdir': False, 'gfid_mismatch': False, >>> >> 'slave_name': None, 'slave_gfid': None, 'name_mismatch': False, 'dst': >>> >> False}) >>> >> [2019-11-16 17:29:44.779306] I [master(worker >>> >> /home/sas/gluster/data/code-misc6):942:handle_entry_failures] _GMaster: >>> >> Sucessfully fixed entry ops with gfid mismatch retry_count=1 >>> >> [2019-11-16 17:29:44.779516] I [master(worker >>> >> /home/sas/gluster/data/code-misc6):1194:process_change] _GMaster: Retry >>> >> original entries. 
count = 1 >>> >> [2019-11-16 17:29:44.879321] E [repce(worker >>> >> /home/sas/gluster/data/code-misc6):214:__call__] RepceClient: call >>> >> failed call=151945:140353273153344:1573925384.78 method=entry_ops >>> >> error=OSError >>> >> [2019-11-16 17:29:44.879750] E [syncdutils(worker >>> >> /home/sas/gluster/data/code-misc6):338:log_raise_exception] <top>: FAIL: >>> >> Traceback (most recent call last): >>> >> File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 322, >>> >> in main >>> >> func(args) >>> >> File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 82, >>> >> in subcmd_worker >>> >> local.service_loop(remote) >>> >> File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line >>> >> 1277, in service_loop >>> >> g3.crawlwrap(oneshot=True) >>> >> File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 599, >>> >> in crawlwrap >>> >> self.crawl() >>> >> File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1555, >>> >> in crawl >>> >> self.changelogs_batch_process(changes) >>> >> File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1455, >>> >> in changelogs_batch_process >>> >> self.process(batch) >>> >> File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1290, >>> >> in process >>> >> self.process_change(change, done, retry) >>> >> File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1195, >>> >> in process_change >>> >> failures = self.slave.server.entry_ops(entries) >>> >> File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 233, in >>> >> __call__ >>> >> return self.ins(self.meth, *a) >>> >> File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 215, in >>> >> __call__ >>> >> raise res >>> >> OSError: [Errno 13] Permission denied: >>> >> '/home/sas/gluster/data/code-misc6/.glusterfs/6a/90/6a9008b1-a4aa-4c30-9ae7-92a33e05d0bb' >>> >> [2019-11-16 17:29:44.911767] I [repce(agent >>> >> /home/sas/gluster/data/code-misc6):97:service_loop] RepceServer: >>> >> 
terminating on reaching EOF. >>> >> [2019-11-16 17:29:45.509344] I [monitor(monitor):278:monitor] Monitor: >>> >> worker died in startup phase brick=/home/sas/gluster/data/code-misc6 >>> >> [2019-11-16 17:29:45.511806] I >>> >> [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker >>> >> Status Change status=Faulty >>> > >>> > >>> > >>> > >>> > Now after upgrading to 7.0 version from 5.6 we got an error in >>> > geo-replication. >>> > Scenario: >>> > >>> > We had a 1x3 replication and distributed volume in each DC. >>> > Both volumes are started and the geo-replication session is set up >>> > between them and the files are synched. Now the geo-replication session >>> > is deleted. >>> > Started to upgrade to 7.0 for each server starting from the slave end. I >>> > followed this link --> >>> > https://docs.gluster.org/en/latest/Upgrade-Guide/upgrade_to_4.1/ >>> > After starting the glusterd process created a geo-replication again but >>> > ends up in a faulty state. Please find the logs >>> > >>> >> [2019-11-28 11:59:12.370255] I >>> >> [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker >>> >> Status Change status=Initializing... >>> >> >>> >> [2019-11-28 11:59:12.370615] I [monitor(monitor):159:monitor] Monitor: >>> >> starting gsyncd worker brick=/home/sas/gluster/data/code-misc >>> >> slave_node=192.168.185.84 >>> >> >>> >> [2019-11-28 11:59:12.445581] I [gsyncd(agent >>> >> /home/sas/gluster/data/code-misc):311:main] <top>: Using session config >>> >> file >>> >> path=/var/lib/glusterd/geo-replication/code-misc_192.168.185.118_code-misc/gsyncd.conf >>> >> >>> >> [2019-11-28 11:59:12.448383] I [changelogagent(agent >>> >> /home/sas/gluster/data/code-misc):72:__init__] ChangelogAgent: Agent >>> >> listining... 
>>> >> >>> >> [2019-11-28 11:59:12.453881] I [gsyncd(worker >>> >> /home/sas/gluster/data/code-misc):311:main] <top>: Using session config >>> >> file >>> >> path=/var/lib/glusterd/geo-replication/code-misc_192.168.185.118_code-misc/gsyncd.conf >>> >> >>> >> [2019-11-28 11:59:12.472862] I [resource(worker >>> >> /home/sas/gluster/data/code-misc):1386:connect_remote] SSH: Initializing >>> >> SSH connection between master and slave... >>> >> >>> >> [2019-11-28 11:59:12.933346] E [syncdutils(worker >>> >> /home/sas/gluster/data/code-misc):311:log_raise_exception] <top>: >>> >> connection to peer is broken >>> >> >>> >> [2019-11-28 11:59:12.934117] E [syncdutils(worker >>> >> /home/sas/gluster/data/code-misc):805:errlog] Popen: command returned >>> >> error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i >>> >> /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto >>> >> -S /tmp/gsyncd-aux-ssh-tKcFQe/5697733f424862ab9d57e019de78aca6.sock >>> >> sas@192.168.185.84 /usr/libexec/glusterfs/gsyncd slave code-misc >>> >> sas@192.168.185.118::code-misc --master-node 192.168.185.89 >>> >> --master-node-id a7a9688e-700c-4452-9cd6-e10d6eed5335 --master-brick >>> >> /home/sas/gluster/data/code-misc --local-node 192.168.185.84 >>> >> --local-node-id cbafeca3-650b-4c9e-8ea6-2451ea9265dd --slave-timeout 120 >>> >> --slave-log-level INFO --slave-gluster-log-level INFO >>> >> --slave-gluster-command-dir /usr/sbin --master-dist-count 3 error=1 >>> >> >>> >> [2019-11-28 11:59:12.934436] E [syncdutils(worker >>> >> /home/sas/gluster/data/code-misc):809:logerr] Popen: ssh> >>> >> ************************************************************************************************************************** >>> >> >>> >> [2019-11-28 11:59:12.934703] E [syncdutils(worker >>> >> /home/sas/gluster/data/code-misc):809:logerr] Popen: ssh> WARNING: This >>> >> system is a restricted access system. All activity on this system is >>> >> subject to monitoring. 
If information collected reveals possible >>> >> criminal activity or activity that exceeds privileges, evidence of such >>> >> activity may be providedto the relevant authorities for further action. >>> >> >>> >> [2019-11-28 11:59:12.934967] E [syncdutils(worker >>> >> /home/sas/gluster/data/code-misc):809:logerr] Popen: ssh> By continuing >>> >> past this point, you expressly consent to this monitoring.- ZOHO >>> >> Corporation >>> >> >>> >> [2019-11-28 11:59:12.935194] E [syncdutils(worker >>> >> /home/sas/gluster/data/code-misc):809:logerr] Popen: ssh> >>> >> ************************************************************************************************************************** >>> >> >>> >> [2019-11-28 11:59:12.944369] I [repce(agent >>> >> /home/sas/gluster/data/code-misc):97:service_loop] RepceServer: >>> >> terminating on reaching EOF. >>> >> >>> >> [2019-11-28 11:59:12.944722] I [monitor(monitor):280:monitor] Monitor: >>> >> worker died in startup phase brick=/home/sas/gluster/data/code-misc >>> >> >>> >> [2019-11-28 11:59:12.947575] I >>> >> [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker >>> >> Status Change status=Faulty >>> > >>> > >>> ________ Community Meeting Calendar: APAC Schedule - Every 2nd and 4th Tuesday at 11:30 AM IST Bridge: https://bluejeans.com/441850968 NA/EMEA Schedule - Every 1st and 3rd Tuesday at 01:00 PM EDT Bridge: https://bluejeans.com/441850968 Gluster-users mailing list Gluster-users@gluster.org https://lists.gluster.org/mailman/listinfo/gluster-users
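If the WARNING banner in the logs above is what pollutes the gsyncd ssh stream, one possible workaround (a sketch, assuming OpenSSH on the slave nodes and that the banner comes from sshd's Banner directive rather than PAM/motd or a login script, which this does not touch; "sas" is the geo-rep user from this thread) is to suppress the banner for that user only:

```
# /etc/ssh/sshd_config on the slave nodes (sketch)
# Suppress the legal banner for the geo-rep user so the non-interactive
# gsyncd ssh sessions see only gluster output.
Match User sas
    Banner none
```

After editing, reload sshd and re-run the non-interactive test command Sunny suggested; it should then print only the gluster volume status output.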