Hi Ludwig,

> the fixes for the tickets you mention did change the iteration through the
> changelog and how it handles situations when the start csn is not found in
> the changelog. It also changed the logging, so you might see messages now
> which were not there or hidden before.

That was my understanding too.

> But I am very surprised to see them so frequently and I would like to
> understand it.
> First some questions, do you have changelog trimming enabled and how, do you
> have fractional replication ?

Yes to both questions.

Trimming: 14 days

Fractional replication:
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn
nsds5ReplicaStripAttrs: modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp internalCreatorsname

Changelog:
cn=changelog5,cn=config
objectClass: top
objectClass: extensibleObject
cn: changelog5
nsslapd-changelogdir: /Local/dirsrv/var/lib/dirsrv/slapd-ens/changelogdb
nsslapd-changelogmaxage: 14d

Replica:
cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5Replica
cn: replica
nsDS5ReplicaId: 1
nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu
nsDS5Flags: 1
nsDS5ReplicaBindDN: cn=RepliX,cn=config
nsds5ReplicaPurgeDelay: 604800
nsds5ReplicaTombstonePurgeInterval: 86400
nsds5ReplicaLegacyConsumer: False
nsDS5ReplicaType: 3
nsState:: AQAAAAAAAADCrc5XAAAAAAAAAAAAAAAAAQAAAAAAAAABAAAAAAAAAA==
nsDS5ReplicaName: eeb6d304-736c11e6-9bc5a1ff-40280b8e
nsds5ReplicaChangeCount: 114948
nsds5replicareapactive: 0

Typical replication agreement:
cn=Replication from ldap-lab.<domain name> to ldap-adm.<domain name>,cn=replica,cn=dc\\3Did\\2Cdc\\3Dpolytechnique\\2Cdc\\3Dedu,cn=mapping tree,cn=config
objectClass: top
objectClass: nsDS5ReplicationAgreement
cn: Replication from ldap-lab.<domain name> to ldap-adm.<domain name>
description: Replication agreement from server ldap-lab.<domain name> to server ldap-adm.<domain name>
nsDS5ReplicaHost: ldap-adm.<domain name>
nsDS5ReplicaRoot: dc=id,dc=polytechnique,dc=edu
nsDS5ReplicaPort: 636
nsDS5ReplicaTransportInfo: SSL
nsDS5ReplicaBindDN: cn=RepliX,cn=config
nsDS5ReplicaBindMethod: simple
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE entryusn memberOf
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn
nsds5ReplicaStripAttrs: modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp internalCreatorsname
nsds5replicaBusyWaitTime: 5
nsds5ReplicaFlowControlPause: 500
nsds5ReplicaFlowControlWindow: 1000
nsds5replicaTimeout: 120
nsDS5ReplicaCredentials: {AES-...
nsds50ruv: {replicageneration} 57cd7377000000020000
nsds50ruv: {replica 2 ldap://ldap-adm.<domain name>:389}
nsruvReplicaLastModified: {replica 2 ldap://ldap-adm.<domain name>:389} 00000000
nsds5replicareapactive: 0
nsds5replicaLastUpdateStart: 20160906115520Z
nsds5replicaLastUpdateEnd: 20160906115520Z
nsds5replicaChangesSentSinceStartup: 3:13525/670 1:3671/0 2:1/0
nsds5replicaLastUpdateStatus: 0 Replica acquired successfully: Incremental update succeeded
nsds5replicaUpdateInProgress: FALSE
nsds5replicaLastInitStart: 19700101000000Z
nsds5replicaLastInitEnd: 19700101000000Z

> Next, is it possible to get the access and error logs for a period of an hour
> from all servers (you can send them off list) ? I would like to track some of
> the reported csns.

Sure, I will send them to you off list in a moment.

Thank you,
Regards,
Andrey

> Regards,
> Ludwig

> On 09/06/2016 12:31 PM, Ivanov Andrey (M.) wrote:
>> Hi,
>>
>> We have been successfully using the compiled 1.3.4 git branch of 389DS in
>> production on CentOS 7 for about a year (approximately 40 000 entries, about
>> 4000 groups, hundreds of reads and tens of writes per second).
>>
>> Our current topology consists of 3 servers in a triangle (each server is a
>> master replicating to the 2 others, so two read-write replication agreements
>> on each).
>>
>> Since the fixes for Ticket 48766 ("Replication changelog can incorrectly
>> skip over updates") and Ticket 48954 ("Replication fails because anchorcsn
>> cannot be found") I’ve started to see the following regular warnings in the
>> error logs:
>>
>> [06/Sep/2016:01:21:43 +0200] clcache_load_buffer_bulk - changelog record with csn (57cdfe06000100010000) not found for DB_NEXT
>> [06/Sep/2016:01:21:43 +0200] agmt="cn=Replication from ldap-adm.<domain> to ldap-lab.<domain>" (ldap-lab:636) - Can't locate CSN 57cdfe06000100010000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
>> [06/Sep/2016:02:35:25 +0200] - replica_generate_next_csn: opcsn=57ce0f4e000500020000 <= basecsn=57ce0f4e000500030000, adjusted opcsn=57ce0f4e000600020000
>> [06/Sep/2016:04:10:11 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce257e000400030000) not found for DB_NEXT
>> [06/Sep/2016:05:16:58 +0200] - replica_generate_next_csn: opcsn=57ce352b000000020000 <= basecsn=57ce352b000100010000, adjusted opcsn=57ce352b000100020000
>> [06/Sep/2016:06:56:04 +0200] agmt="cn=Replication from ldap-adm.<domain> to ldap-ens.<domain>" (ldap-ens:636) - Can't locate CSN 57ce4c62000100030000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
>> [06/Sep/2016:07:29:00 +0200] agmt="cn=Replication from ldap-adm.<domain> to ldap-ens.<domain>" (ldap-ens:636) - Can't locate CSN 57ce541a000200030000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
>> [06/Sep/2016:07:34:20 +0200] agmt="cn=Replication from ldap-adm.<domain> to ldap-lab.<domain>" (ldap-lab:636) - Can't locate CSN 57ce5559000100010000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
>> [06/Sep/2016:07:34:27 +0200] agmt="cn=Replication from ldap-adm.<domain> to ldap-lab.<domain>" (ldap-lab:636) - Can't locate CSN 57ce5561000000010000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
>> [06/Sep/2016:07:40:17 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce56c0000500030000) not found for DB_NEXT
>> [06/Sep/2016:07:40:24 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce56c5000100030000) not found for DB_NEXT
>> [06/Sep/2016:08:08:36 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce5d5f000f00010000) not found for DB_NEXT
>> [06/Sep/2016:08:12:39 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce5e54000200030000) not found for DB_NEXT
>> [06/Sep/2016:08:12:39 +0200] agmt="cn=Replication from ldap-adm.<domain> to ldap-ens.<domain>" (ldap-ens:636) - Can't locate CSN 57ce5e54000200030000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
>> [06/Sep/2016:08:26:45 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce61a3000200030000) not found for DB_NEXT
>> [06/Sep/2016:08:27:40 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce61d8000200030000) not found for DB_NEXT
>> [06/Sep/2016:08:27:40 +0200] agmt="cn=Replication from ldap-adm.<domain> to ldap-ens.<domain>" (ldap-ens:636) - Can't locate CSN 57ce61d8000200030000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
>> [06/Sep/2016:08:31:42 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce62c8000300010000) not found for DB_NEXT
>> [06/Sep/2016:08:34:05 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce635a000100010000) not found for DB_NEXT
>> [06/Sep/2016:08:44:28 +0200] clcache_load_buffer_bulk - changelog record with csn (57ce65c9000200030000) not found for DB_NEXT
>> [06/Sep/2016:08:52:25 +0200] agmt="cn=Replication from ldap-adm.<domain> to ldap-ens.<domain>" (ldap-ens:636) - Can't locate CSN 57ce67aa000100030000 in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
>> [06/Sep/2016:08:53:04 +0200] - replica_generate_next_csn: opcsn=57ce67d1000100020000 <= basecsn=57ce67d1000200030000, adjusted opcsn=57ce67d1000200020000
>>
>> These warnings are present on all three servers and for all replication
>> agreements. One of the servers is virtual and the two others are physical.
>>
>> Replication still seems to work fine in spite of these warnings. The
>> "replica_generate_next_csn" message is not new: it has always been there
>> with 1.3.4. The two new warnings are "clcache_load_buffer_bulk" and
>> "Can't locate CSN ... in the changelog (DB rc=-30988)." There are no network
>> problems or anything like that, so the cause could only be the replication
>> topology (3-master fully-connected triangle) and/or the servers being rather
>> busy. Is it a bug, a warning that can be ignored, or something else?
>>
>> Thank you!
>> --
>> 389-users mailing list
>> 389-users@lists.fedoraproject.org
>> https://lists.fedoraproject.org/admin/lists/389-users@lists.fedoraproject.org

> --
> Red Hat GmbH, http://www.de.redhat.com/ , Registered seat: Grasbrunn,
> Commercial register: Amtsgericht Muenchen, HRB 153243,
> Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric Shander
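
P.S. A side note for anyone trying to track the CSNs reported in the error logs. As far as I understand, a 389 DS CSN string is 20 hex digits: an 8-digit Unix timestamp, a 4-digit sequence number, a 4-digit replica ID and a 4-digit sub-sequence number, compared in that order. The Python sketch below decodes and compares CSNs under that assumption. The `CSN` class and its method names are made up for illustration and are not part of 389 DS.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True, order=True)
class CSN:
    """Hypothetical helper; assumes the 8+4+4+4 hex-digit CSN layout."""

    # Field order matters: the generated ordering compares the timestamp
    # first, then sequence number, replica ID and sub-sequence number.
    timestamp: int
    seqnum: int
    replica_id: int
    subseqnum: int

    @classmethod
    def parse(cls, text: str) -> "CSN":
        """Split a 20-hex-digit CSN string into its four fields."""
        if len(text) != 20:
            raise ValueError(f"expected 20 hex digits, got {text!r}")
        return cls(
            timestamp=int(text[0:8], 16),
            seqnum=int(text[8:12], 16),
            replica_id=int(text[12:16], 16),
            subseqnum=int(text[16:20], 16),
        )

    def utc_time(self) -> datetime:
        """Return the timestamp field as a timezone-aware UTC datetime."""
        return datetime.fromtimestamp(self.timestamp, tz=timezone.utc)


if __name__ == "__main__":
    missing = CSN.parse("57cdfe06000100010000")
    print(missing.replica_id, missing.seqnum, missing.utc_time().isoformat())

    opcsn = CSN.parse("57ce0f4e000500020000")
    basecsn = CSN.parse("57ce0f4e000500030000")
    adjusted = CSN.parse("57ce0f4e000600020000")
    print(opcsn <= basecsn, adjusted > basecsn)
```

Under this reading, 57cdfe06000100010000 was generated by replica ID 1 at 2016-09-05 23:21:42 UTC, that is 01:21:42 +0200, one second before the "not found for DB_NEXT" message that mentions it. The replica_generate_next_csn lines fit the same layout: the original opcsn has the same timestamp and sequence number as basecsn but a lower replica ID, and the adjusted opcsn bumps the sequence number from 5 to 6 so that it sorts after basecsn.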