On 05/01/2014 11:22 PM, CJ Beck wrote:
Ok, I have found a way to get back to "ChangeLog"... This might be related to
the similar thread we have going regarding the method for setting up the initial
geo-replication session. It seems as though, when geo-replication was set up on my
cluster, it tried to open the changelog socket, but it wasn't there.
In order to fix this, I had to do the following (the equivalent CLI commands are sketched after the list):
* Stop geo-replication
* Stop volume
* Start volume
* Change geo-replication "change_detector" to changelog
* Start geo-replication
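In gluster CLI terms, that sequence is roughly the following (a sketch using the volume
and slave names from this thread; adjust them to your own session):

gluster volume geo-replication gluster-poc 10.10.10.120::gluster-poc stop
gluster volume stop gluster-poc
gluster volume start gluster-poc
gluster volume geo-replication gluster-poc 10.10.10.120::gluster-poc config change_detector changelog
gluster volume geo-replication gluster-poc 10.10.10.120::gluster-poc start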
Once I did that, it went to Hybrid mode first, then changed to ChangeLog mode.
That's correct. But the question is why change-logging was unavailable.
The socket is created when changelog is turned on (done by
geo-replication on start).
There would be an initial one-shot hybrid crawl (not a full FS crawl) and
then a switch-over to change-logging. This happens on geo-rep restart.
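A quick way to check that change-logging is actually available on a brick node
(a sketch; the socket path format is taken from the changes.log excerpt further down):

# Is the changelog translator enabled on the volume?
gluster volume info gluster-poc | grep changelog
# expect to see: changelog.changelog: on

# Does the per-brick changelog socket exist?
ls -l /var/run/gluster/changelog-*.sock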
-CJ
From: CJ Beck <chris.b...@workday.com>
Date: Thursday, May 1, 2014 at 10:28 AM
To: Venky Shankar <yknev.shan...@gmail.com>
Cc: "gluster-users@gluster.org" <gluster-users@gluster.org>
Subject: Re: [Gluster-users] Question about geo-replication and deletes in 3.5 beta train
I just noticed this, which might be related to the change to xsync?
[root@dev604 eafea2c974a3c29ecfbf48cea274dc23]# more changes.log
[2014-04-30 15:45:27.807181] I [gf-changelog.c:179:gf_changelog_notification_init] 0-glusterfs: connecting to changelog socket: /var/run/gluster/changelog-eafea2c974a3c29ecfbf48cea274dc23.sock (brick: /data/sac-poc)
[2014-04-30 15:45:27.807257] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 1/5...
[2014-04-30 15:45:29.807404] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 2/5...
[2014-04-30 15:45:31.807607] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 3/5...
[2014-04-30 15:45:33.807818] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 4/5...
[2014-04-30 15:45:35.808038] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 5/5...
[2014-04-30 15:45:37.808239] E [gf-changelog.c:204:gf_changelog_notification_init] 0-glusterfs: could not connect to changelog socket! bailing out...
From: CJ Beck <chris.b...@workday.com>
Date: Wednesday, April 30, 2014 at 2:50 PM
To: Venky Shankar <yknev.shan...@gmail.com>
Cc: "gluster-users@gluster.org" <gluster-users@gluster.org>
Subject: Re: [Gluster-users] Question about geo-replication and deletes in 3.5 beta train
I just got back to testing this, and for some reason, on my "freshly" created cluster and
geo-replication session, it's defaulting to "Hybrid Mode". It also seems to keep bouncing
back to xsync as the change method.
Geo-replication log:
[root@dev604 gluster-poc]# egrep -i 'changelog|xsync' *
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:45:27.763072] I [master(/data/gluster-poc):58:gmaster_builder] <top>: setting up xsync change detection mode
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:45:27.765294] I [master(/data/gluster-poc):58:gmaster_builder] <top>: setting up changelog change detection mode
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:45:27.768302] I [master(/data/gluster-poc):1103:register] _GMaster: xsync temp directory: /var/run/gluster/gluster-poc/ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc/eafea2c974a3c29ecfbf48cea274dc23/xsync
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:45:37.808617] I [master(/data/gluster-poc):682:fallback_xsync] _GMaster: falling back to xsync mode
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:45:52.113879] I [master(/data/gluster-poc):58:gmaster_builder] <top>: setting up xsync change detection mode
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:45:52.116525] I [master(/data/gluster-poc):58:gmaster_builder] <top>: setting up xsync change detection mode
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:45:52.120129] I [master(/data/gluster-poc):1103:register] _GMaster: xsync temp directory: /var/run/gluster/gluster-poc/ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc/eafea2c974a3c29ecfbf48cea274dc23/xsync
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:45:52.120604] I [master(/data/gluster-poc):1103:register] _GMaster: xsync temp directory: /var/run/gluster/gluster-poc/ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc/eafea2c974a3c29ecfbf48cea274dc23/xsync
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:45:54.146847] I [master(/data/gluster-poc):1133:crawl] _GMaster: processing xsync changelog /var/run/gluster/gluster-poc/ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc/eafea2c974a3c29ecfbf48cea274dc23/xsync/XSYNC-CHANGELOG.1398872752
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:47:08.204514] I [master(/data/gluster-poc):58:gmaster_builder] <top>: setting up xsync change detection mode
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:47:08.206767] I [master(/data/gluster-poc):58:gmaster_builder] <top>: setting up xsync change detection mode
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:47:08.210570] I [master(/data/gluster-poc):1103:register] _GMaster: xsync temp directory: /var/run/gluster/gluster-poc/ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc/eafea2c974a3c29ecfbf48cea274dc23/xsync
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:47:08.211069] I [master(/data/gluster-poc):1103:register] _GMaster: xsync temp directory: /var/run/gluster/gluster-poc/ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc/eafea2c974a3c29ecfbf48cea274dc23/xsync
ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc.log:[2014-04-30 15:47:09.247109] I [master(/data/gluster-poc):1133:crawl] _GMaster: processing xsync changelog /var/run/gluster/gluster-poc/ssh%3A%2F%2Froot%4010.10.10.120%3Agluster%3A%2F%2F127.0.0.1%3Agluster-poc/eafea2c974a3c29ecfbf48cea274dc23/xsync/XSYNC-CHANGELOG.1398872828
[root@dev604 gluster-poc]# gluster volume geo-replication gluster-poc 10.10.10.120::gluster-poc status detail

MASTER NODE          MASTER VOL     MASTER BRICK         SLAVE                        STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
dev604.domain.com    gluster-poc    /data/gluster-poc    10.10.10.120::gluster-poc    Active     N/A                  Hybrid Crawl    0              323              0                0                  0
dev606.domain.com    gluster-poc    /data/gluster-poc    10.10.10.122::gluster-poc    Passive    N/A                  N/A             0              0                0                0                  0
dev605.domain.com    gluster-poc    /data/gluster-poc    10.10.10.121::gluster-poc    Passive    N/A                  N/A             0              0                0                0                  0
From: Venky Shankar <yknev.shan...@gmail.com>
Date: Wednesday, April 23, 2014 at 12:09 PM
To: CJ Beck <chris.b...@workday.com>
Cc: "gluster-users@gluster.org" <gluster-users@gluster.org>
Subject: Re: [Gluster-users] Question about geo-replication and deletes in 3.5 beta train
That should not happen. After a replica failover, the "now" active node should continue
from where the "old" active node left off.
Could you provide geo-replication logs from the master and the slave after reproducing
this (with changelog mode)?
Thanks,
-venky
On Thu, Apr 17, 2014 at 9:34 PM, CJ Beck <chris.b...@workday.com> wrote:
I did set it intentionally, because I found a case where files would be missed during
geo-replication; xsync seemed to handle that case better. The issue occurs when you bring
down the "Active" node that is handling the geo-replication session while the change
method is set to ChangeLog. Any files written into the cluster while geo-replication is
down (e.g., while the geo-replication session is being failed over to another node) are
missed/skipped and won't ever be transferred to the other cluster.
Is this the expected behavior? If not, I can open a bug on it.
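For reference, the reproduction is straightforward (a sketch; /mnt/gluster-poc is a
hypothetical client mount of the master volume):

# With the session in changelog mode and the Active node up:
touch /mnt/gluster-poc/file-before-failover

# Bring the Active node down (power off / stop glusterd) and, while the
# session is failing over to a Passive node, write more data:
touch /mnt/gluster-poc/file-during-failover

# After a Passive node takes over as Active, file-before-failover reaches
# the slave, but file-during-failover never does.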
-CJ
From: Venky Shankar <yknev.shan...@gmail.com>
Date: Wednesday, April 16, 2014 at 4:43 PM
To: CJ Beck <chris.b...@workday.com>
Cc: "gluster-users@gluster.org" <gluster-users@gluster.org>
Subject: Re: [Gluster-users] Question about geo-replication and deletes in 3.5 beta train
On Thu, Apr 17, 2014 at 3:01 AM, CJ Beck <chris.b...@workday.com> wrote:
I did have the "change_detector" set to xsync, which seems to be the issue
(bypassing the changelog method). So I can fix that and see if the deletes are propagated.
Was that set intentionally? Setting this as the main change detection mechanism
crawls the filesystem every 60 seconds to replicate the changes. Changelog
mode handles live changes, so any deletes performed before this option was set
would not be propagated.
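Switching back is a per-session config change; a sketch using the names from your status
output (the '!option' reset syntax is from memory, so treat that part as an assumption):

# set changelog as the change detection mechanism:
gluster volume geo-replication test-poc 10.10.1.120::test-poc config change_detector changelog

# or reset the option to its default:
gluster volume geo-replication test-poc 10.10.1.120::test-poc config '!change_detector'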
Also, is there a way to tell the geo-replication to go ahead and walk the filesystems to
do a "sync" so the remote side files are deleted, if they are not on the source?
As of now, no. With distributed geo-replication, the geo-rep daemon crawls the
bricks (instead of the mount). Since a brick holds only a subset of the filesystem
entities (e.g., in a distributed volume), it's hard to find purged entries without
crawling the mount and comparing the entries between master and slave (which is
slow). This is where changelog mode helps.
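In other words, without a changelog, the only generic way to find purged entries would
be a full listing comparison along these lines (a sketch from hypothetical client mounts
of the master and slave volumes; not something geo-rep actually does):

# Entries present on the slave but missing on the master are deletion
# candidates; on a large volume this crawl is prohibitively slow.
comm -13 <(cd /mnt/master && find . | sort) <(cd /mnt/slave && find . | sort)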
Thanks for the quick reply!
[root@host ~]# gluster volume geo-replication test-poc 10.10.1.120::test-poc status detail

MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE                    STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
host1.com      test-poc      /data/test-poc    10.10.1.120::test-poc    Passive    N/A                  N/A             382            0                0                0                  0
host2.com      test-poc      /data/test-poc    10.10.1.122::test-poc    Passive    N/A                  N/A             0              0                0                0                  0
host3.com      test-poc      /data/test-poc    10.10.1.121::test-poc    Active     N/A                  Hybrid Crawl    10765          70               0                0                  0
From: Venky Shankar <yknev.shan...@gmail.com>
Date: Wednesday, April 16, 2014 at 1:54 PM
To: CJ Beck <chris.b...@workday.com>
Cc: "gluster-users@gluster.org" <gluster-users@gluster.org>
Subject: Re: [Gluster-users] Question about geo-replication and deletes in 3.5 beta train
"ignore-deletes" is only valid in the initial crawl mode[1] where it does not
propagate deletes to the slave (changelog mode does). Was the session restarted by any
chance?
[1] Geo-replication now has two internal operations modes: a one shot
filesystem crawl mode (used to replicate data already present in a volume) and
the changelog mode (for replicating live changes).
Thanks,
-venky
On Thu, Apr 17, 2014 at 1:25 AM, CJ Beck <chris.b...@workday.com> wrote:
I have an issue where deletes are not being propagated to the slave cluster in
a geo-replicated environment. I've looked through the code, and it appears as
though this might have been changed to be hard-coded.
When I try to change it via a config option on the command line, it replies with a
"Reserved option" error:
[root@host ~]# gluster volume geo-replication test-poc 10.10.1.120::test-poc config ignore_deletes 1
Reserved option
geo-replication command failed
[root@host ~]# gluster volume geo-replication test-poc 10.10.1.120::test-poc config ignore-deletes 1
Reserved option
geo-replication command failed
[root@host ~]#
Looking at the source code (although I'm not a C expert by any means), it seems as
though it's hard-coded to be "true" all the time?
(from glusterd-geo-rep.c):
/* ignore-deletes */
runinit_gsyncd_setrx (&runner, conf_path);
runner_add_args (&runner, "ignore-deletes", "true", ".", ".", NULL);
RUN_GSYNCD_CMD;
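If one wanted to experiment with propagating deletes in that mode, presumably that call
is the place to change; a hypothetical, untested edit:

/* hypothetical, untested: pass "false" instead of the hard-coded "true"
 * so the initial-crawl mode no longer ignores deletes */
runner_add_args (&runner, "ignore-deletes", "false", ".", ".", NULL);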
Any ideas how to get deletes propagated to the slave cluster?
Thanks!
-CJ
_______________________________________________
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users