[389-users] Replication : clcache_load_buffer - Can't locate CSN ... in the changelog (DB rc=-12797)

Ivanov Andrey (M.) via 389-users Wed, 14 Jan 2026 08:59:41 -0800

Hi,

I can confirm that we have also been seeing these alerts in our logs since the 
migration (for example, during the 24 hours of yesterday: ~670 on one server, 
~7456 on the second and ~35 on the third).
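
Counts like these can be gathered per agreement with a small script. The following is only a minimal sketch, assuming one log entry per line in the format of the messages quoted in this thread; the function name count_missing_csn and the sample lines (unwrapped copies of messages quoted further down) are illustrative, not part of any 389 DS tooling.

```python
import re
from collections import Counter

# Captures the agreement name and the CSN from one unwrapped error-log line.
PATTERN = re.compile(
    r'agmt="(?P<agmt>[^"]+)".*?clcache_load_buffer - '
    r"Can't locate CSN (?P<csn>[0-9a-fA-F]{20}) in the changelog"
)


def count_missing_csn(lines):
    """Return (messages per agreement, messages per CSN) as two Counters."""
    per_agmt = Counter()
    per_csn = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            per_agmt[match.group("agmt")] += 1
            per_csn[match.group("csn")] += 1
    return per_agmt, per_csn


# Illustrative input: messages from this thread, joined back onto one line each.
TAIL = (" in the changelog (DB rc=-12797). If replication stops, "
        "the consumer may need to be reinitialized.")
sample = [
    '[10/Jul/2025:09:58:55.479913820 +0200] - ERR - '
    'agmt="cn=agreement-mdir01-to-sdir02" (10:1389) - clcache_load_buffer - '
    "Can't locate CSN 686f72bf00050f4b0000" + TAIL,
    '[10/Jul/2025:09:58:55.484629809 +0200] - ERR - '
    'agmt="cn=agreement-mdir01-to-mdir02" (10:1389) - clcache_load_buffer - '
    "Can't locate CSN 686f72bf00050f4b0000" + TAIL,
    '[10/Jul/2025:10:01:07.738009372 +0200] - ERR - '
    'agmt="cn=agreement-mdir01-to-sdir02" (10:1389) - clcache_load_buffer - '
    "Can't locate CSN 686f734300000f4b0000" + TAIL,
    "an unrelated log line that must not be counted",
]

per_agmt, per_csn = count_missing_csn(sample)
for name, count in per_agmt.most_common():
    print(count, name)
```

To run it against a real error log, pass an open file object instead of the sample list; the location of that file depends on the installation.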


We have migrated our production environment (3 servers in a triangle topology 
with active-active multimaster replication). The migration combined a version 
upgrade (2.7.0 to 3.1.4, commit 062aa6eab) with a BDB -> LMDB migration, with 
full initialization from scratch:
* export of the old data to LDIF
* "rm -fr" of all the data and code binary folders
* compile and install new version, configure with deployment scripts
* offline LDIF import with "dsctl $LDAP_SERVER_IDENTIFIER ldif2db userRoot"
* establish replication (LDAPS with a dedicated replication manager account 
login/pwd)
* initialization of the two remaining servers from the same initial node.

It's our standard upgrade procedure that has worked reliably since we have been 
using 389DS (we started in 2006 with FDS). Our first migration attempt in 
December failed due to repeated crashes during "MODRDN with new superior" 
operations (it was fixed several days ago in 
https://github.com/389ds/389-ds-base/issues/7108, thanks!). This time, the 
migration completed successfully and all indicators are green:
* "dsctl ... healthcheck" reports no issues
* "ds-replcheck  online:" shows no differences between servers, indicating that 
replication is functioning correctly.
* there have been no complaints from users or applications relying on LDAP.

So the error message does not seem to indicate any real problem. We observe the 
same error in our test environment as well, though in much smaller quantities. 
My impression is that this behavior is more likely related to the database 
backend (LMDB now versus BDB previously) than to the server version or any 
other configuration aspect. There are no other unexplained or significant 
errors in the LDAP error logs.

Typical messages look like the following (replica ID 2 corresponds to 
ldap-second.example.com):
[14/Jan/2026:10:56:28.454005516 +0100] - ERR - agmt="cn=Replication from 
ldap-second.example.com to ldap-third.example.com" (ldap-third:636) - 
clcache_load_buffer - Can't locate CSN 6967684c001600020000 in the changelog 
(DB rc=-12797). If replication stops, the consumer may need to be reinitialized.

[14/Jan/2026:10:56:21.693741475 +0100] - ERR - agmt="cn=Replication from 
ldap-second.polytechnique.fr to ldap-first.polytechnique.fr" (ldap-first:636) - 
clcache_load_buffer - Can't locate CSN 69676845004300020000 in the changelog 
(DB rc=-12797). If replication stops, the consumer may need to be reinitialized.
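
The replica ID can be read directly from the CSN string. Below is a minimal sketch, assuming the usual 389 DS CSN layout of 20 hex digits (8 for the timestamp in seconds since the Unix epoch, 4 for the sequence number, 4 for the replica ID, 4 for the sub-sequence number); the CSN class and parse_csn function are names made up for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CSN:
    timestamp: int   # seconds since the Unix epoch
    seqnum: int      # sequence number within that second
    replica_id: int  # ID of the replica that originated the change
    subseqnum: int   # sub-sequence number


def parse_csn(text: str) -> CSN:
    """Split a 20-hex-digit CSN string into its four fields (assumed layout)."""
    if len(text) != 20:
        raise ValueError(f"expected 20 hex digits, got {len(text)}")
    return CSN(
        timestamp=int(text[0:8], 16),
        seqnum=int(text[8:12], 16),
        replica_id=int(text[12:16], 16),
        subseqnum=int(text[16:20], 16),
    )


# First CSN from the messages above.
csn = parse_csn("6967684c001600020000")
when = datetime.fromtimestamp(csn.timestamp, tz=timezone.utc)
print(csn.replica_id, csn.seqnum, when.isoformat())
# prints: 2 22 2026-01-14T09:56:28+00:00
```

Under that assumption, both CSNs above decode to replica ID 2, and the timestamp of the first one (09:56:28 UTC) matches the 10:56:28 +0100 of the log entry that reports it.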
 

I can open a ticket on github and perform the necessary debugging in our test 
environment if no explanation for this behavior has been identified so far.


Regards,
Andrey




----- Original message -----
> From: "Julian Kippels via 389-users" <[email protected]>
> To: [email protected]
> Cc: "Julian Kippels" 
> Sent: Tuesday, 29 July 2025 10:23:37
> Subject: [389-users] Re: [Extern] Re: Re: Error message "Can't locate CSN in 
> the changelog", allthough CSN is present in
> the changelog

> Just as a heads-up: I am also experiencing the same behaviour in version
> 3.1.2
> 
> Kind regards
> Julian
> 
> On 11.07.25 at 13:09, Julian Kippels via 389-users wrote:
>> Hi,
>> 
>> we are experiencing the exact same error on version 2.4.5.
>> 
>> [11/Jul/2025:12:05:59.340776929 +0200] - ERR - agmt="cn=replication-
>> agreement-ldap-consumer-1-test" (ldap-consumer-1-test:636) -
>> clcache_load_buffer - Can't locate CSN 6870e207000100010000 in the
>> changelog (DB rc=-12797). If replication stops, the consumer may need to
>> be reinitialized.
>> [11/Jul/2025:12:05:59.349374401 +0200] - ERR - agmt="cn=replication-
>> agreement-ldap-consumer-2-test" (ldap-consumer-2-test:636) -
>> clcache_load_buffer - Can't locate CSN 6870e207000200010000 in the
>> changelog (DB rc=-12797). If replication stops, the consumer may need to
>> be reinitialized.
>> 
>> I have seen the error message "clcache_load_buffer - Can't load changelog
>> buffer starting at CSN ..." once, but that was while I was experimenting
>> with nsds5ReplicaIgnoreMissingChange=once. Since this produced even worse
>> results than leaving the attribute unset, I have reverted to not having
>> it set.
>> 
>> Unfortunately I am not able to run the dbscan command, because we are
>> using mdb and not bdb and "dbscan -D mdb ..." produces the error message
>> "Can't initialize db plugin: mdb"
>> 
>> As for reproducing: it seems to happen when a sufficiently large
>> volume of changes is made in a single connection. I have seen it with as
>> few as 40 operations per connection.
>> 
>> I hope this helps in debugging.
>> 
>> Kind regards
>> Julian Kippels
>> 
>> On 11.07.25 at 11:47, Thierry Bordaz via 389-users wrote:
>>> Hi,
>>>
>>> This is an interesting finding and something we have not seen so far.
>>> Did you also see log messages like "clcache_load_buffer - Can't load
>>> changelog buffer starting at CSN..."?
>>> It is logged when a replication agreement is preparing an iterator and
>>> cannot locate the starting point (CSN) in the replication changelog.
>>>
>>> Did you dump the changelog with dbscan ?
>>> For a given missing CSN (from logs), are you able to retrieve it with:
>>> dbscan -f /var/lib/dirsrv/slapd-instance/db//
>>> replication_changelog.db -k <csn>
>>>
>>> I was not able to reproduce it on 2.6.1-6. Did you identify a
>>> reproducible test case? (Size of the DB, number of updates, how long
>>> after the topology was set up it started to happen...)
>>>
>>>
>>> best regards
>>> thierry
>>>
>>> On 7/11/25 8:30 AM, Fl Sch via 389-users wrote:
>>>> Hello,
>>>>
>>>> we have recently upgraded our 389-ds setup to version 2.6.1 running
>>>> on AlmaLinux 9.6 (installed from the official AlmaLinux appstream
>>>> repo). Our upgrade approach was to build a completely new setup,
>>>> import all the data and afterwards switch the IP addresses of the old
>>>> and new servers.
>>>> Our setup consists of 2 suppliers (mdir01 + mdir02) and 2 consumers
>>>> (sdir01 + sdir02). The two suppliers have replication agreements with
>>>> each other, as well as agreements to both consumers. Our
>>>> provisioning system is designed to write changes only to mdir01; it
>>>> uses mdir02 only if it can't reach mdir01.
>>>> Our clients (DHCP servers) use all 4 directory servers.
>>>>
>>>> Since the upgrade, we have been observing the following messages
>>>> in the error logs of both suppliers:
>>>> 
>>>> [10/Jul/2025:09:58:55.479913820 +0200] - ERR - agmt="cn=agreement-
>>>> mdir01-to-sdir02" (10:1389) - clcache_load_buffer - Can't locate CSN
>>>> 686f72bf00050f4b0000 in the changelog (DB rc=-12797). If replication
>>>> stops, the consumer may need to be reinitialized.
>>>> [10/Jul/2025:09:58:55.484629809 +0200] - ERR - agmt="cn=agreement-
>>>> mdir01-to-mdir02" (10:1389) - clcache_load_buffer - Can't locate CSN
>>>> 686f72bf00050f4b0000 in the changelog (DB rc=-12797). If replication
>>>> stops, the consumer may need to be reinitialized.
>>>> [10/Jul/2025:09:58:55.484868342 +0200] - ERR - agmt="cn=agreement-
>>>> mdir01-to-sdir01" (10:1389) - clcache_load_buffer - Can't locate CSN
>>>> 686f72bf00050f4b0000 in the changelog (DB rc=-12797). If replication
>>>> stops, the consumer may need to be reinitialized.
>>>> [10/Jul/2025:10:01:07.738009372 +0200] - ERR - agmt="cn=agreement-
>>>> mdir01-to-sdir02" (10:1389) - clcache_load_buffer - Can't locate CSN
>>>> 686f734300000f4b0000 in the changelog (DB rc=-12797). If replication
>>>> stops, the consumer may need to be reinitialized.
>>>> [10/Jul/2025:10:01:07.741023198 +0200] - ERR - agmt="cn=agreement-
>>>> mdir01-to-mdir02" (10:1389) - clcache_load_buffer - Can't locate CSN
>>>> 686f734300000f4b0000 in the changelog (DB rc=-12797). If replication
>>>> stops, the consumer may need to be reinitialized.
>>>>
>>>> [09/Jul/2025:12:18:14.429187195 +0200] - ERR - agmt="cn=agreement-
>>>> mdir02-to-sdir01" (10:1389) - clcache_load_buffer - Can't locate CSN
>>>> 686eb26600000f4b0000 in the changelog (DB rc=-12797). If replication
>>>> stops, the consumer may need to be reinitialized.
>>>> [09/Jul/2025:12:18:14.430628311 +0200] - ERR - agmt="cn=agreement-
>>>> mdir02-to-sdir02" (10:1389) - clcache_load_buffer - Can't locate CSN
>>>> 686eb26600000f4b0000 in the changelog (DB rc=-12797). If replication
>>>> stops, the consumer may need to be reinitialized.
>>>> [09/Jul/2025:12:18:16.909625172 +0200] - ERR - agmt="cn=agreement-
>>>> mdir02-to-sdir01" (10:1389) - clcache_load_buffer - Can't locate CSN
>>>> 686eb26800000f4b0000 in the changelog (DB rc=-12797). If replication
>>>> stops, the consumer may need to be reinitialized.
>>>> [09/Jul/2025:12:18:16.913147068 +0200] - ERR - agmt="cn=agreement-
>>>> mdir02-to-sdir02" (10:1389) - clcache_load_buffer - Can't locate CSN
>>>> 686eb26800000f4b0000 in the changelog (DB rc=-12797). If replication
>>>> stops, the consumer may need to be reinitialized.
>>>> [09/Jul/2025:12:42:30.255121122 +0200] - ERR - agmt="cn=agreement-
>>>> mdir02-to-sdir01" (10:1389) - clcache_load_buffer - Can't locate CSN
>>>> 686f60d500010f4b0000 in the changelog (DB rc=-12797). If replication
>>>> stops, the consumer may need to be reinitialized.
>>>> 
>>>>
>>>> On mdir01 these messages appear on average every 5 minutes during
>>>> peak hours.
>>>> On mdir02 they appear much less frequently, on average every 15-20 minutes.
>>>>
>>>> However, looking through the changelog I can find all the CSNs which
>>>> it apparently can't locate:
>>>> 
>>>> changetype: delete
>>>> replgen: 62b5bf320000010f0000
>>>> csn: 686f72bf00050f4b0000
>>>> nsuniqueid: da0de304-593411f0-aef0dc83-b57f4cfd
>>>> dn:
>>>> ClientIdentifier=00:00:00:00:d9:05,ou=dhcpldap,o=customer,dc=domain,dc=net
>>>>
>>>> changetype: delete
>>>> replgen: 62b5bf320000010f0000
>>>> csn: 686f734300000f4b0000
>>>> nsuniqueid: 4c315e08-5d6111f0-aef0dc83-b57f4cfd
>>>> dn:
>>>> ClientIdentifier=00:00:00:00:c6:f4,ou=dhcpldap,o=customer,dc=domain,dc=net
>>>>
>>>> changetype: delete
>>>> replgen: 62b5bf320000010f0000
>>>> csn: 686eb26600000f4b0000
>>>> nsuniqueid: 54f4e05b-04a311f0-9233a0e8-dc56aea6
>>>> dn:
>>>> ClientIdentifier=00:00:00:00:26:99,ou=dhcpldap,o=customer,dc=domain,dc=net
>>>>
>>>> changetype: delete
>>>> replgen: 62b5bf320000010f0000
>>>> csn: 686eb26800000f4b0000
>>>> nsuniqueid: 463ba84b-0af711f0-a597a0e8-dc56aea6
>>>> dn:
>>>> ClientIdentifier=00:00:00:00:f7:a0,ou=dhcpldap,o=customer,dc=domain,dc=net
>>>>
>>>> changetype: add
>>>> replgen: 62b5bf320000010f0000
>>>> csn: 686f60d500010f4b0000
>>>> nsuniqueid: e9d45f81-5d5811f0-aef0dc83-b57f4cfd
>>>> parentuniqueid: 6167af03-f3c311ec-862ceac3-35201d04
>>>> dn:
>>>> ClientIdentifier=00:00:00:00:16:68,ou=dhcpldap,o=customer,dc=domain,dc=net
>>>> change:: ...
>>>> 
>>>>
>>>> All the changes are propagated to all directory servers in the
>>>> cluster. So there is no real problem visible.
>>>> In general, we have not seen any problems with replication
>>>> whatsoever, we just have these seemingly "false" messages in the
>>>> error log.
>>>>
>>>> Changelog trim is currently set to the following values:
>>>> 
>>>> nsslapd-changelogmaxage: 30d
>>>> nsslapd-changelogtrim-interval: 3600
>>>> 
>>>>
>>>> Does anybody know why these error messages appear? And if / how we
>>>> can get rid of them?
>>>> I just want to make sure that there is really no underlying issue
>>>> somewhere. And if those messages really falsely appear, I would like
>>>> to get rid of them if possible to avoid confusion and to stop
>>>> spamming the error logs.
>>>>
>>>>
>>>> Thank you very much in advance.
>>>
>> 
> 
> --
> _______________________________________________
> 389-users mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> Fedora Code of Conduct:
> https://docs.fedoraproject.org/en-US/project/code-of-conduct/
> List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
> List Archives:
> https://lists.fedoraproject.org/archives/list/[email protected]
> Do not reply to spam, report it:
> https://pagure.io/fedora-infrastructure/new_issue