Ok I know what is going now...
We get this error when we try to end a replication session while the
start of a replication session request is still in progress.
Replica access log (conn=7 is the supplier):
[05/Mar/2025:21:26:45.651516142 -0500] conn=7 op=19 EXT
oid="2.16.840.1.113730.3.5.12" name="replication-multisupplier-extop"
--> START SESSION
[05/Mar/2025:21:26:45.666796859 -0500] conn=7 op=20 EXT
oid="2.16.840.1.113730.3.5.5" name="replication-multisupplier-extop" -->
END SESSION
[05/Mar/2025:21:26:45.694423788 -0500] conn=7 op=20 RESULT *err=2
*tag=120 nentries=0 wtime=0.000123842 optime=0.027627701
etime=0.027750453 --> END SESSION RESULT
---> This "*err=2*" is what triggers this decoding error on the
supplier: [05/Mar/2025:21:26:45.706707732 -0500] - ERR -
decode_repl_ext_response - decoding failed: response_code (Ok)
ruv_bervals (Ok) data_guid (Ok) data (Ok) bvdata (No data)
[05/Mar/2025:21:26:45.664625578 -0500] conn=7 op=19 RESULT err=0 tag=120
nentries=0 wtime=0.000175835 optime=0.013111958 etime=0.013287409 -->
START SESSION RESULT
So supplier 1 tries to start a replication session, but before the
replica can send the response to that "start" request, the same supplier
tries to "end" it. This is what triggers that decoding error. It still
appears to be harmless, but there is some race condition going on here.
I'm going to open a new ticket to investigate it more thoroughly.
Mark
On 3/5/25 4:45 PM, Orion Poplawski wrote:
Mark,
I don't see that string in any of the /var/log/dirsrv/slapd-*/errors* files on
our IPA servers.
Thanks,
Orion
On 3/5/25 14:38, Mark Reynolds wrote:
Orion,
Can you please check all your replicas to if see any error messages contain
"encoding failed" ?
That is the only case I can find where the bvdata is NULL during a replica
release extended operation.
Thanks,
Mark
On 2/25/25 12:38 PM, Orion Poplawski wrote:
No sign of replication being broken, but we do get this reasonably regularly.
Would it be worth reporting anywhere or any other info the might be useful?
Thanks for the resonse,
Orion
On 2/3/25 13:27, Mark Reynolds wrote:
Hi Orion,
These two messages just mean that the replication session did not close/end
correctly. It is harmless on its own - unless replication is broken. We have
been seeing these errors recently when a replica is being released, but no way
to reproduce it yet.
Not sure about the content-sync-plugin warning as I'm not familiar with that
plugin.
Mark
On 2/3/25 10:44 AM, Orion Poplawski via 389-users wrote:
[02/Feb/2025:20:57:56.553507369 -0700] - ERR - decode_repl_ext_response -
decoding failed: response_code (Ok) ruv_bervals (Ok) data_guid (Ok) data
(Ok) bvdata (No data)
[02/Feb/2025:20:57:56.555181048 -0700] - ERR - NSMMReplicationPlugin -
release_replica - agmt="cn=IPA2-IPA1" (IPA1:389): Unable to parse the
response to the endReplication extended operation.
--
Identity Management Development Team
--
_______________________________________________
389-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct:
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives:
https://lists.fedoraproject.org/archives/list/[email protected]
Do not reply to spam, report it:
https://pagure.io/fedora-infrastructure/new_issue