From: freeipa-users-boun...@redhat.com 
[mailto:freeipa-users-boun...@redhat.com] On Behalf Of Rich Megginson
Sent: Monday, October 27, 2014 6:42 AM
To: freeipa-users@redhat.com
Subject: Re: [Freeipa-users] multi-master replication

On 10/25/2014 06:17 PM, Dmitri Pal wrote:
On 10/24/2014 07:15 PM, Craig White wrote:
From: freeipa-users-boun...@redhat.com<mailto:freeipa-users-boun...@redhat.com> 
[mailto:freeipa-users-boun...@redhat.com] On Behalf Of Craig White
Sent: Friday, October 24, 2014 4:02 PM
To: freeipa-users@redhat.com<mailto:freeipa-users@redhat.com>
Subject: [Freeipa-users] multi-master replication

I would have thought that changes go from replica to master and not just master 
to replica.

Is there something I have to do to make the changes bi-directional?

Replying to my own post...
Logs are my friend  ;-)

[24/Oct/2014:23:08:17 +0000] NSMMReplicationPlugin - 
agmt="cn=meToipa001.domain.local " (ipa001:389): Replication bind with GSSAPI 
auth resumed
[24/Oct/2014:23:08:17 +0000] NSMMReplicationPlugin - 
agmt="cn=meToipa001.domain.local " (ipa001:389): Warning: unable to replicate 
schema: rc=2
[24/Oct/2014:23:08:17 +0000] NSMMReplicationPlugin - 
agmt="cn=meToipa001.domain.local " (ipa001:389): Failed to send update 
operation to consumer (uniqueid e018060f-5bb011e4-81078979-dc802980, CSN 
544aa346000000030000): Can't contact LDAP server. Will retry later.
[24/Oct/2014:23:08:17 +0000] NSMMReplicationPlugin - 
agmt="cn=meToipa001.domain.local " (ipa001:389): Consumer failed to replay 
change (uniqueid (null), CSN (null)): Can't contact LDAP server(-1). Will retry 
later.

These NULLs look suspicious.
I hope DS gurus will have more for you on Monday.

1) Yes, replication is fully bi-directional.
2) What are the exact versions of dirsrv?  rpm -q 389-ds-base on supplier and 
consumer.
3) Can you reproduce the problem using the replication log level on both the 
supplier and consumer?  
http://www.port389.org/docs/389ds/FAQ/faq.html#troubleshooting
----

1)      I sort of expected it to be - that's what had me worried when it didn't 
work as expected.

2)      ]# rpm -qa | grep 389

389-ds-base-1.2.11.15-47.el6.x86_64

389-ds-base-libs-1.2.11.15-47.el6.x86_64

3)      Upped the log levels - hopefully this is a reasonable representative 
from ipa002 (the replica)

[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - conn=4 op=229791 
repl="dc=stt,dc=local": Acquired replica
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - conn=4 op=229791 
repl="dc=stt,dc=local": StartNSDS90ReplicationRequest: response=0 rc=0
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - conn=4 op=229791 
Relinquishing consumer connection extension
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - ruv_add_csn_inprogress: 
successfully inserted csn 544e6b02000000040000 into pending list
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - Purged state information 
from entry 
krbprincipalname=ldap/ipa002.FQDN@FQDN,cn=services,cn=accounts,dc=stt,dc=local 
up to CSN 5445307e000000040000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - changelog program - 
_cl5GetDBFileByReplicaName: found DB object f32b80 for database 
/var/lib/dirsrv/slapd-STT-LOCAL/cldb/dababc08-5bb011e4-81078979-dc802980_544aa332000000040000.db4
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - changelog program - 
_cl5GetDBFileByReplicaName: found DB object f32b80 for database 
/var/lib/dirsrv/slapd-STT-LOCAL/cldb/dababc08-5bb011e4-81078979-dc802980_544aa332000000040000.db4
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - ruv_update_ruv: 
successfully committed csn 544e6b02000000040000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): Disconnected from the consumer
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): Warning: unable to replicate schema: rc=2
[27/Oct/2014:15:55:47 +0000] - csngen_adjust_time: gen state before 
544e6b040001:1414425347:0:1
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - changelog program - 
_cl5GetDBFile: found DB object f32b80 for database 
/var/lib/dirsrv/slapd-STT-LOCAL/cldb/dababc08-5bb011e4-81078979-dc802980_544aa332000000040000.db4
[27/Oct/2014:15:55:47 +0000] - _cl5PositionCursorForReplay 
(agmt="cn=meToipa001.FQDN" (ipa001:389)): Consumer RUV:
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): {replicageneration} 544aa332000000040000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): {replica 4 ldap://ipa001.FQDN:389} 544aa3a1000000040000 
544e6b02000000040000 00000000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): {replica 3 ldap://ipa002.FQDN:389} 544aa339000000030000 
544aa33f000400030000 00000000
[27/Oct/2014:15:55:47 +0000] - _cl5PositionCursorForReplay 
(agmt="cn=meToipa001.FQDN" (ipa001:389)): Supplier RUV:
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): {replicageneration} 544aa332000000040000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): {replica 3 ldap://ipa002.FQDN:389} 544aa339000000030000 
544e6aa4000300030000 544e6aa3
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): {replica 4 ldap://ipa001.FQDN:389} 544aa3a1000000040000 
544e6b02000000040000 544e6b03
[27/Oct/2014:15:55:47 +0000] agmt="cn=meToipa001.FQDN" (ipa001:389) - 
clcache_get_buffer: found thread private buffer cache 7fe85000ea50
[27/Oct/2014:15:55:47 +0000] agmt="cn=meToipa001.FQDN" (ipa001:389) - 
clcache_get_buffer: _pool is f89af0 _pool->pl_busy_lists is 7fe8500197d0 
_pool->pl_busy_lists->bl_buffers is 7fe85000ea50
[27/Oct/2014:15:55:47 +0000] agmt="cn=meToipa001.FQDN" (ipa001:389) - session 
start: anchorcsn=544aa33f000400030000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - changelog program - 
agmt="cn=meToipa001.FQDN" (ipa001:389): CSN 544aa33f000400030000 found, 
position set for replay
[27/Oct/2014:15:55:47 +0000] agmt="cn=meToipa001.FQDN" (ipa001:389) - load=1 
rec=1 csn=544aa346000000030000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): replay_update: Sending add operation 
(dn="cn=KDC,cn=ipa002.FQDN,cn=masters,cn=ipa,cn=etc,dc=stt,dc=local" 
csn=544aa346000000030000)
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): replay_update: Consumer could not replay operation with csn 
544aa346000000030000
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): Failed to send update operation to consumer (uniqueid 
e018060f-5bb011e4-81078979-dc802980, CSN 544aa346000000030000): Can't contact 
LDAP server. Will retry later.
[27/Oct/2014:15:55:47 +0000] - repl5_inc_result_threadmain starting
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): Connection disconnected by another thread
[27/Oct/2014:15:55:47 +0000] - repl5_inc_result_threadmain: read result for 
message_id 0
[27/Oct/2014:15:55:47 +0000] - repl5_inc_result_threadmain: result 3, -1, 2, 0, 
(null)
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): Consumer failed to replay change (uniqueid (null), CSN (null)): 
Can't contact LDAP server(-1). Will retry later.
[27/Oct/2014:15:55:47 +0000] - repl5_inc_result_threadmain: got op result 205 
should finish 1
[27/Oct/2014:15:55:47 +0000] - repl5_inc_result_threadmain exiting
[27/Oct/2014:15:55:47 +0000] agmt="cn=meToipa001.FQDN" (ipa001:389) - session 
end: state=0 load=1 sent=1 skipped=0 skipped_new_rid=0 
skipped_csn_gt_cons_maxcsn=0 skipped_up_to_date=0 skipped_csn_gt_ruv=0 
skipped_csn_covered=0
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): Warning: unable to send endReplication extended operation (Can't 
contact LDAP server)
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): State: sending_updates -> start_backoff
[27/Oct/2014:15:55:47 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): State: start_backoff -> start_backoff
[27/Oct/2014:15:55:49 +0000] NSMMReplicationPlugin - conn=4 op=229793 Acquired 
consumer connection extension
[27/Oct/2014:15:55:49 +0000] NSMMReplicationPlugin - conn=4 op=229793 
repl="dc=stt,dc=local": Released replica held by locking_purl=conn=4 id=229791
[27/Oct/2014:15:55:49 +0000] NSMMReplicationPlugin - conn=4 op=229793 
Relinquishing consumer connection extension
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): State: start_backoff -> backoff
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): Trying non-secure slapi_ldap_init_ext
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): binddn = ,  passwd =
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): Replication bind with GSSAPI auth resumed
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): No linger to cancel on the connection
[27/Oct/2014:15:55:51 +0000] - _csngen_adjust_local_time: gen state before 
544e6b040001:1414425347:0:1
[27/Oct/2014:15:55:51 +0000] - _csngen_adjust_local_time: gen state after 
544e6b080000:1414425351:0:1
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): Replica was successfully acquired.
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - agmt="cn=meToipa001.FQDN" 
(ipa001:389): State: backoff -> sending_updates
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - conn=4 op=229794 Acquired 
consumer connection extension
[27/Oct/2014:15:55:51 +0000] NSMMReplicationPlugin - conn=4 op=229794 
repl="dc=stt,dc=local": Begin incremental protocol
[27/Oct/2014:15:55:51 +0000] - csngen_adjust_time: gen state before 
544e6b080001:1414425351:0:1
[27/Oct/2014:15:55:51 +0000] - csngen_adjust_time: gen state after 
544e6b080001:1414425351:0:1

This is the log from ipa002 (which was the replica) and I am confused by the 
error logged, 'Can't contact LDAP server' because it surely can...
]# telnet 172.29.31.100 389
Trying 172.29.31.100...
Connected to 172.29.31.100.
Escape character is '^]'.
^]
telnet> quit
Connection closed.

This is a new setup (setup master on Thursday, imported users from OpenLDAP and 
then setup replica on Friday). Had a few issues that I didn't understand which 
may or may not be relevant. First, after I made some changes to SSSD to allow 
for sudo and hopefully HBAC, it seemed like the directory server crashed and I 
had to restart it. Seemed like the same thing happened at the same time when I 
likewise configured the replica. Also, when I was trying to setup the replica, 
I had to resort to -skip-conncheck as it (presume Kerberos) not only asked for 
the password for admin@$DOMAIN but also the password for 
admin@ipa001.$DOMAIN<mailto:admin@ipa001.$DOMAIN> for which I had never 
specifically setup any password and no password already used in Kerberos worked.

Thanks

Craig
-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go To http://freeipa.org for more info on the project

Reply via email to