Hi,
You are describing two different symptoms:
- replication gets desynchronized
- replication initialization fails
We do not have much data about the first one; it could have many causes.
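If you want to confirm that the nodes have actually diverged, one option is
to compare each node's RUV. Here is a minimal sketch with python-ldap; the
node URIs come from your logs, while the bind DN and password are
placeholders for your own credentials:
---------
import ldap

# Node URIs taken from the logs below; credentials are placeholders.
NODES = [
    "ldap://node01.ldap.colorado.br:389",
    "ldap://node02.ldap.colorado.br:389",
    "ldap://node03-ldap:389",
]
SUFFIX = "dc=colorado,dc=local"
BIND_DN = "cn=Directory Manager"
BIND_PW = "secret"

# The RUV is stored in a special tombstone entry under the suffix,
# so the filter must mention nsTombstone explicitly.
RUV_FILTER = ("(&(objectClass=nsTombstone)"
              "(nsUniqueId=ffffffff-ffffffff-ffffffff-ffffffff))")

for uri in NODES:
    conn = ldap.initialize(uri)
    conn.simple_bind_s(BIND_DN, BIND_PW)
    results = conn.search_s(SUFFIX, ldap.SCOPE_SUBTREE,
                            RUV_FILTER, ["nsds50ruv"])
    print(uri)
    for _dn, attrs in results:
        for val in attrs.get("nsds50ruv", []):
            print("  " + val.decode())
    conn.unbind_s()
---------
If the max CSN for a given replica ID differs between nodes, those nodes
have diverged for that replica.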
About the second one:
The "bdb_bulk_import_queue returned 0 with entry ..." messages are normal.
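When triaging a log like yours, it helps to skip the DEBUG noise and jump
straight to the ERR lines. A throwaway sketch (the log path is the common
default and will differ with your instance name):
---------
# Print only the ERR lines from a 389 DS error log.
# Adjust the instance name in the path for your deployment.
LOG = "/var/log/dirsrv/slapd-localhost/errors"

with open(LOG, errors="replace") as f:
    for line in f:
        if " - ERR - " in line:
            print(line.rstrip())
---------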
The first error is:
[09/Dec/2024:11:32:30.390378254 -0300] - ERR - factory_destructor - ERROR
bulk import abandoned
This happens on the server being initialized when the connection handling
the replication initialization is closed unexpectedly.
That is usually because:
[1] there was a problem on the supplier from which the replication gets
initialized (are there errors in that supplier's error log, or has it
crashed? see the status-check sketch below), or
[2] the network is unreliable (which could also explain why replication
gets desynchronized).
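To check [1] without guessing, you can read the last init and update status
straight from the agreement entries on the supplier. Another hedged sketch
with python-ldap; the credentials are placeholders again, and the suffix in
the cn=config DN may appear quoted or escaped depending on your version:
---------
import ldap

SUPPLIER = "ldap://node01.ldap.colorado.br:389"
BIND_DN = "cn=Directory Manager"  # placeholder
BIND_PW = "secret"                # placeholder

# Replication agreements live under the replica entry in cn=config.
BASE = ('cn=replica,cn="dc=colorado,dc=local",'
        'cn=mapping tree,cn=config')
ATTRS = ["nsds5replicaLastInitStatus",
         "nsds5replicaLastUpdateStatus"]

conn = ldap.initialize(SUPPLIER)
conn.simple_bind_s(BIND_DN, BIND_PW)
agmts = conn.search_s(BASE, ldap.SCOPE_ONELEVEL,
                      "(objectClass=nsds5replicationAgreement)",
                      ATTRS)
for dn, attrs in agmts:
    print(dn)
    for name in ATTRS:
        for val in attrs.get(name, []):
            print("  %s: %s" % (name, val.decode()))
conn.unbind_s()
---------
The same information is exposed by dsconf's repl-agmt subcommands if you
prefer the CLI.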
Regards,
Pierre
On Mon, Dec 9, 2024 at 5:20 PM Luiz Gustavo Quirino via 389-users <
[email protected]> wrote:
> I’m facing issues with replication in the following scenario:
>
> 3 Linux nodes (Rocky) running version 2.4.5 B2024.198.0000 of 389.
> Replication is configured in a ring topology:
> node01 -> node02 -> node03 -> node01.
> Password changes are made via the PWM-Project web interface.
>
> Problem:
> At some point, the synchronization between nodes is lost.
> When I attempt to restart replication, the node being updated crashes the
> database.
> For example, when initializing replication from node01 to node02, the
> following error occurs:
> ---------
> [09/Dec/2024:11:32:30.382466035 -0300] - DEBUG - bdb_ldbm_back_wire_import
> - bdb_bulk_import_queue returned 0 with entry
> uid=app.tzv.w,OU=APLICACOES,dc=colorado,dc=local
> [09/Dec/2024:11:32:30.387198997 -0300] - DEBUG - bdb_ldbm_back_wire_import
> - bdb_bulk_import_queue returned 0 with entry
> uid=app.poc.w,OU=APLICACOES,dc=colorado,dc=local
> [09/Dec/2024:11:32:30.390378254 -0300] - ERR - factory_destructor - ERROR
> bulk import abandoned
> [09/Dec/2024:11:32:30.557600717 -0300] - ERR - bdb_import_run_pass -
> import userroot: Thread monitoring returned: -23
>
> [09/Dec/2024:11:32:30.559453847 -0300] - ERR - bdb_public_bdb_import_main
> - import userroot: Aborting all Import threads...
> [09/Dec/2024:11:32:36.468531612 -0300] - ERR - bdb_public_bdb_import_main
> - import userroot: Import threads aborted.
> [09/Dec/2024:11:32:36.470641812 -0300] - INFO - bdb_public_bdb_import_main
> - import userroot: Closing files...
> [09/Dec/2024:11:32:36.553007637 -0300] - ERR - bdb_public_bdb_import_main
> - import userroot: Import failed.
> [09/Dec/2024:11:32:36.574692177 -0300] - DEBUG - NSMMReplicationPlugin -
> consumer_connection_extension_destructor - Aborting total update in
> progress for replicated area dc=colorado,dc=local connid=7019159
> [09/Dec/2024:11:32:36.577255941 -0300] - ERR - process_bulk_import_op -
> NULL target sdn
> [09/Dec/2024:11:32:36.579573401 -0300] - DEBUG - NSMMReplicationPlugin -
> replica_relinquish_exclusive_access - conn=7019159 op=-1
> repl="dc=colorado,dc=local": Released replica held by
> locking_purl=conn=7019159 id=3
> [09/Dec/2024:11:32:36.600514849 -0300] - ERR - pw_get_admin_users - Search
> failed for
> cn=GRP_SRV_PREHASHED_PASSWORD,ou=389,OU=GRUPOS,ou=colorado,dc=colorado,dc=local:
> error 10 - Password Policy Administrators can not be set
> [09/Dec/2024:11:32:36.757883417 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop - decoding payload...
> [09/Dec/2024:11:32:36.760105387 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop - decoded protocol_oid: 2.16.840.1.113730.3.6.1
> [09/Dec/2024:11:32:36.762467539 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop - decoded repl_root: dc=colorado,dc=local
> [09/Dec/2024:11:32:36.765113155 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop - decoded csn: 6756ff84000001910000
> [09/Dec/2024:11:32:36.767727935 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop: RUV:
> [09/Dec/2024:11:32:36.769205061 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop: {replicageneration} 6748f91f000001910000
> [09/Dec/2024:11:32:36.770721824 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop: {replica 401 ldap://node01.ldap.colorado.br:389}
> 6748f921000001910000 6756ff6f000101910000 00000000
> [09/Dec/2024:11:32:36.772753378 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop: {replica 403 ldap://node03-ldap:389}
> 6748f9db000101930000 6756ff79000001930000 00000000
> [09/Dec/2024:11:32:36.774289526 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop: {replica 402 ldap://node02.ldap.colorado.br:389}
> 6748f996000101920000 6756ff34000001920000 00000000
> [09/Dec/2024:11:32:36.775750926 -0300] - DEBUG - NSMMReplicationPlugin -
> decode_startrepl_extop - Finshed decoding payload.
> [09/Dec/2024:11:32:36.777404849 -0300] - DEBUG - NSMMReplicationPlugin -
> consumer_connection_extension_acquire_exclusive_access - conn=7019230 op=4
> Acquired consumer connection extension
> [09/Dec/2024:11:32:36.779856975 -0300] - DEBUG - NSMMReplicationPlugin -
> multisupplier_extop_StartNSDS50ReplicationRequest - conn=7019230 op=4
> repl="dc=colorado,dc=local": Begin incremental protocol
> [09/Dec/2024:11:32:36.781999075 -0300] - DEBUG - _csngen_adjust_local_time
> - gen state before 6756ff7b0001:1733754747:0:0
> [09/Dec/2024:11:32:36.784626039 -0300] - DEBUG - _csngen_adjust_local_time
> - gen state after 6756ff840000:1733754756:0:0
> [09/Dec/2024:11:32:36.786708353 -0300] - DEBUG - csngen_adjust_time - gen
> state before 6756ff840000:1733754756:0:0
> [09/Dec/2024:11:32:36.788232997 -0300] - DEBUG - csngen_adjust_time - gen
> state after 6756ff840001:1733754756:0:0
> [09/Dec/2024:11:32:36.790217310 -0300] - DEBUG - NSMMReplicationPlugin -
> replica_get_exclusive_access - conn=7019230 op=4
> repl="dc=colorado,dc=local": Acquired replica
> ----------
> To restore synchronization, I need to delete all replication
> configurations and recreate them. However, the issue reappears after some
> time.
>
> I’d appreciate any suggestions on how to identify and resolve this issue
> permanently.
>
> Thanks.
--
389 Directory Server Development Team
_______________________________________________
389-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct:
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives:
https://lists.fedoraproject.org/archives/list/[email protected]
Do not reply to spam, report it:
https://pagure.io/fedora-infrastructure/new_issue