On 04/13/2012 01:03 PM, Dan Scott wrote:
Thanks for the quick response.

Simo: Thanks - I'd prefer to clean it up properly rather than start from scratch. I haven't changed the LDAP schema at all. All I've done is use the IPA tools for user admin and add/remove replicas. I just felt like I've been emailing this list once a week or so for the past few months - I was beginning to think that it was beyond repair! :)

On Fri, Apr 13, 2012 at 14:38, Rich Megginson <rmegg...@redhat.com> wrote:
On 04/13/2012 12:22 PM, Dan Scott wrote:
On Fri, Apr 13, 2012 at 13:43, Rich Megginson <rmegg...@redhat.com> wrote:
On 04/13/2012 11:39 AM, Dan Scott wrote:

I'm convinced that my LDAP directories contain lots of cruft which has built up and is causing problems on my system. There may even be some corruption since there's an entry which I'm unable to remove - this entry does not get replicated to the other servers.

What version of 389-ds-base is this? Do you get any errors? It just silently fails to delete this particular entry?

[root@fileserver1 ~]# rpm -qa|grep 389
389-ds-base-libs-1.2.10.4-2.fc16.x86_64
389-ds-base-1.2.10.4-2.fc16.x86_64
[root@fileserver1 ~]# ldapmodify -f rmfileserver5.ldif -D 'cn=directory manager' -W
Enter LDAP Password:
deleting entry "cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu"
ldap_delete: Operation not allowed on non-leaf (66)

[root@fileserver1 ~]# ldapsearch -D 'cn=directory manager' -W -v -b 'cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu' '(objectclass=*)'
ldap_initialize( <DEFAULT> )
Enter LDAP Password:
filter: (objectclass=*)
requesting: All userApplication attributes
# extended LDIF
#
# LDAPv3
# base <cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu> with scope subtree
# filter: (objectclass=*)
# requesting: ALL
#

# fileserver5.ecg.mit.edu, masters, ipa, etc, ecg.mit.edu
dn: cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
cn: fileserver5.ecg.mit.edu
objectClass: top
objectClass: nsContainer

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1
[root@fileserver1 ~]#

If I'm interpreting this correctly, it can't be deleted because it's not a leaf node, but it doesn't have any sub-entries that I can delete first.

You are correct. Try this:

ldapsearch -D 'cn=directory manager' -W -v -b 'cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu' '(|(objectclass=nstombstone)(objectclass=*))'

Ahh, so there are some 'child' entries:

[root@fileserver1 ~]# ldapsearch -D 'cn=directory manager' -W -b 'cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu' '(|(objectclass=nstombstone)(objectclass=*))'
Enter LDAP Password:
# extended LDIF
#
# LDAPv3
# base <cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu> with scope subtree
# filter: (|(objectclass=nstombstone)(objectclass=*))
# requesting: ALL
#

# aaa2c704-63cf11e1-ac8dadbd-35182efb, fileserver5.ecg.mit.edu, masters, ipa, etc, ecg.mit.edu
dn: nsuniqueid=aaa2c704-63cf11e1-ac8dadbd-35182efb,cn=fileserver5.ecg.mit.edu,
 cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
objectClass: top
objectClass: nsContainer
objectClass: nsTombstone
cn: fileserver5.ecg.mit.edu
nsParentUniqueId: 4fff591e-e48611e0-bf3681aa-d1a3957d

# 17708e04-63dd11e1-9b079095-05c635b0, fileserver5.ecg.mit.edu, masters, ipa, etc, ecg.mit.edu
dn: nsuniqueid=17708e04-63dd11e1-9b079095-05c635b0,cn=fileserver5.ecg.mit.edu,
 cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
objectClass: top
objectClass: nsContainer
objectClass: nsTombstone
cn: fileserver5.ecg.mit.edu
nsParentUniqueId: 4fff591e-e48611e0-bf3681aa-d1a3957d

# 5ceb8604-63f211e1-bc108552-1fbf39e2, fileserver5.ecg.mit.edu, masters, ipa, etc, ecg.mit.edu
dn: nsuniqueid=5ceb8604-63f211e1-bc108552-1fbf39e2,cn=fileserver5.ecg.mit.edu,
 cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
objectClass: top
objectClass: nsContainer
objectClass: nsTombstone
cn: fileserver5.ecg.mit.edu
nsParentUniqueId: 4fff591e-e48611e0-bf3681aa-d1a3957d

# fileserver5.ecg.mit.edu, masters, ipa, etc, ecg.mit.edu
dn: cn=fileserver5.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
cn: fileserver5.ecg.mit.edu
objectClass: top
objectClass: nsContainer

# c480f184-83f011e1-90d1df13-bba55eff, HTTP, fileserver5.ecg.mit.edu, masters
 , ipa, etc, ecg.mit.edu
dn: nsuniqueid=c480f184-83f011e1-90d1df13-bba55eff,cn=HTTP,cn=fileserver5.ecg.
 mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu
objectClass: nsContainer
objectClass: ipaConfigObject
objectClass: top
objectClass: nsTombstone
ipaConfigString: enabledService
ipaConfigString: startOrder 40
cn: HTTP
nsParentUniqueId: 1eba8a03-642311e1-9b95afe9-fc1b53ef

# search result
search: 2
result: 0 Success

# numResponses: 6
# numEntries: 5

Is it safe to delete them?
Yes.
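
For anyone repeating the tombstone search above, here is a minimal sketch for pulling just the tombstone DNs out of the ldapsearch output. The helper name `tombstone_dns` is made up for illustration. It only post-processes LDIF text on stdin, and it assumes tombstones can be recognised by a DN that starts with `nsuniqueid=`, as in the output shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of 389-ds or IPA): read ldapsearch LDIF on
# stdin and print the DN of every entry whose DN starts with "nsuniqueid=".
# LDIF folds long lines; a continuation line begins with a single space,
# so folded DNs are joined back together before matching.
tombstone_dns() {
    awk '
        function emit() {
            # "dn: " is four characters, so the DN itself starts at position 5
            if (tolower(line) ~ /^dn: nsuniqueid=/) print substr(line, 5)
        }
        /^ /  { line = line substr($0, 2); next }   # unfold a continuation line
              { emit(); line = $0 }                 # a new logical line begins
        END   { emit() }
    '
}
```

It could be fed from the same search that was suggested earlier, for example `ldapsearch -D 'cn=directory manager' -W -b "$BASE" '(|(objectclass=nstombstone)(objectclass=*))' | tombstone_dns`, where `$BASE` stands for the entry being inspected.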
I also see inconsistent replication states on the servers. i.e. server1 shows that it's replicating with server2 but server2 does not show that it's replicating with server1.

Do you have errors in the server2 log showing that it is attempting to replicate with server1 but failing with some error?

[root@fileserver1 ~]# ipa-csreplica-manage list -v fileserver1.ecg.mit.edu
Directory Manager password:

fileserver2.ecg.mit.edu
  last init status: None
  last init ended: None
  last update status: 0 Replica acquired successfully: Incremental update succeeded
  last update ended: 2012-04-13 17:57:39+00:00

[root@fileserver1 ~]# ipa-csreplica-manage list -v fileserver2.ecg.mit.edu
Directory Manager password:

fileserver1.ecg.mit.edu
  last init status: None
  last init ended: None
  last update status: 0 Replica acquired successfully: Incremental update succeeded
  last update ended: 2012-04-13 17:57:41+00:00
fileserver3.ecg.mit.edu
  last init status: None
  last init ended: None
  last update status: 0 Replica acquired successfully: Incremental update succeeded
  last update ended: 2012-04-13 17:57:41+00:00

[root@fileserver1 ~]# ipa-csreplica-manage list -v fileserver3.ecg.mit.edu
Directory Manager password:

fileserver2.ecg.mit.edu
  last init status: None
  last init ended: None
  last update status: 0 Replica acquired successfully: Incremental update succeeded
  last update ended: 2012-04-13 17:57:44+00:00
fileserver1.ecg.mit.edu
  last init status: None
  last init ended: None
  last update status: 0 Replica acquired successfully: Incremental update succeeded
  last update ended: 2012-04-13 17:57:43+00:00
[root@fileserver1 ~]#

fileserver1's (and fileserver2's) /var/log/dirsrv/slapd-PKI-IPA/errors contains lots of:

[13/Apr/2012:13:57:43 -0400] NSMMReplicationPlugin - repl_set_mtn_referrals: could not set referrals for replica o=ipaca: 20

This error usually means a replica was deleted and the RUV needs to be cleaned. See http://port389.org/wiki/Howto:CLEANRUV and https://fedorahosted.org/freeipa/ticket/2303 and https://fedorahosted.org/389/ticket/337

OK, I've seen this before - is it important to remove them? I've had to add and remove replicas so much that I don't really want to do it unless it's necessary. I'm happy to live with them if it's not a problem.
It's not a problem until it's a problem :-) I would go ahead and run CLEANRUV.
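
As a rough sketch of what running CLEANRUV involves, following the Howto:CLEANRUV page linked above: the task is started by writing an `nsds5task` value to the replica configuration entry for the suffix. The function name `cleanruv_ldif` is hypothetical, and the suffix and replica ID in the usage note are placeholders; the replica ID to clean has to be read from the RUV first, and the DN form of the replica entry should be checked on the server before applying anything.

```shell
#!/bin/sh
# Hypothetical helper: print the LDIF that starts a CLEANRUV task.
#   $1 = replicated suffix (for example o=ipaca)
#   $2 = replica ID of the deleted replica, taken from the RUV
# The DN layout follows the how-to; verify it under cn=mapping tree,cn=config.
cleanruv_ldif() {
    suffix=$1
    rid=$2
    cat <<EOF
dn: cn=replica,cn="$suffix",cn=mapping tree,cn=config
changetype: modify
replace: nsds5task
nsds5task: CLEANRUV$rid
EOF
}
```

The RUV itself is stored in a special tombstone entry and can be inspected with a search such as `ldapsearch -x -D 'cn=directory manager' -W -H ldap://localhost:7389 -b 'o=ipaca' '(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))' nsds50ruv`. Once the stale replica ID is known (7 is only an example here), the task could be submitted with `cleanruv_ldif 'o=ipaca' 7 | ldapmodify -x -D 'cn=directory manager' -W -H ldap://localhost:7389`. The host and port are assumptions based on the PKI-IPA instance settings quoted elsewhere in this thread.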
fileserver3's /var/log/dirsrv/slapd-PKI-IPA/errors contains lots of:

[13/Apr/2012:13:52:50 -0400] slapi_ldap_bind - Error: could not send startTLS request: error -1 (Can't contact LDAP server) errno 107 (Transport endpoint is not connected)

This is a real connection error - could be cert or hostname lookup related.

How do I find out if it's cert or hostname lookup? Which hostname? Fileserver3 runs DNS, and it seems to be working fine.
Try ldapsearch - on server3:

LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-PKI-IPA ldapsearch -x -ZZ -H ldap://server2.fqdn -D "cn=directory manager" -W -s base -b ""
If that works, check to make sure the replication agreement has the correct server2.fqdn
If that doesn't work, use ldapsearch -d 1 -x ..... to get further debugging information.
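
One way to do the agreement check mentioned above is to list the hostnames stored in the replication agreements and compare them with the name used in the ldapsearch test. This is only a sketch: the helper name `agreement_hosts` is invented here, and it simply filters LDIF on stdin for the `nsDS5ReplicaHost` attribute.

```shell
#!/bin/sh
# Hypothetical helper: read LDIF on stdin and print the distinct values of
# nsDS5ReplicaHost (attribute names are matched case-insensitively).
agreement_hosts() {
    awk -F': ' 'tolower($1) == "nsds5replicahost" { print $2 }' | sort -u
}
```

Assuming the agreements live under `cn=mapping tree,cn=config`, it could be used as `ldapsearch -x -D 'cn=directory manager' -W -b 'cn=mapping tree,cn=config' '(objectclass=nsds5replicationagreement)' nsDS5ReplicaHost | agreement_hosts`, run on server3 against the instance that logs the startTLS error.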
[13/Apr/2012:13:57:39 -0400] NSMMReplicationPlugin - repl_set_mtn_referrals: could not set referrals for replica o=ipaca: 20

fileserver2's non-PKI replication agreements to both fileserver1 and 3 are in place, but both say:

Incremental update has failed and requires administrator actionSystem error.

When I try to re-initialize:

[root@fileserver2 ~]# ipa-replica-manage re-initialize --from fileserver3.ecg.mit.edu
Directory Manager password:

[fileserver3.ecg.mit.edu] reports: Replica Busy! Status: [1 Replication error acquiring replica: replica busy]

This is a transient condition.

Fileserver2 is busy?
Yes.
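
Since the busy status is transient, one generic option is to retry the operation a bounded number of times with a pause in between. The sketch below is plain shell and not an IPA feature. The function name `retry` is made up, and it assumes the wrapped command exits non-zero while the replica is busy, which is not confirmed anywhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep when another attempt is still to come
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}
```

For example, `retry 5 60 ipa-replica-manage re-initialize --from fileserver3.ecg.mit.edu` would try up to five times, one minute apart. Note that the command prompts for the Directory Manager password on each attempt.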
The /var/log/dirsrv/slapd-ECG-MIT-EDU/errors is now full of:

[13/Apr/2012:14:59:19 -0400] NSMMReplicationPlugin - conn=1 op=571 csn=4f70a9e5000100060000: Can't created glue entry cn=fileserver4.ecg.mit.edu,cn=masters,cn=ipa,cn=etc,dc=ecg,dc=mit,dc=edu uniqueid=6949d104-775b11e1-abce82a1-a45dd3c3, error 68

Should I delete the LDAP entry which is trying to replicate fileserver2 with fileserver4?
Yes. And it may be due to the fact that the entry it is trying to delete has those tombstone children that have to be deleted too.
this command has been running for 1/2hr and produced no more output (fileserver2 is the remaining server running Fedora 15, the others are Fedora 16 with latest updates).

Not sure how ipa-replica-manage handles busy - does it keep trying until it is not busy?

Is there some way that I can refresh/clean my LDAP directories and ensure that everything's running correctly?

We first need to find out what's going on and why you are seeing these failures before we can recommend a particular course of action. There is currently no "find all of my problems and fix them" command.

:) Wish there was. It's just that I've been having lots of problems recently and I was thinking that there is something fundamentally wrong with my installation. I keep having to ask you guys for help.

I think some of these problems were due to the fact that an alpha version of 389 got pushed to the Stable repo in F-16, and in between that alpha version and the real "Stable" version we were forced to change the database format to fix a serious issue, and that introduced some inconsistencies into the database upon upgrade.

Yeah, I think most of my troubles have started since that version. Hope I can get it fixed! :)

An additional problem, which Rob Crittenden is helping with, is that I'm trying to install another replica (fileserver4) which fails when setting up the CA:

2012-04-11 11:30:47,289 CRITICAL failed to configure ca instance Command '/usr/bin/perl /usr/bin/pkisilent 'ConfigureCA' '-cs_hostname' 'fileserver4.ecg.mit.edu' '-cs_port' '9445' '-client_certdb_dir' '/tmp/tmp-JJIkrk' '-client_certdb_pwd' XXXXXXXX '-preop_pin' 'LI1En8UwjZ2BYDcnu8nJ' '-domain_name' 'IPA' '-admin_user' 'admin' '-admin_email' 'root@localhost' '-admin_password' XXXXXXXX '-agent_name' 'ipa-ca-agent' '-agent_key_size' '2048' '-agent_key_type' 'rsa' '-agent_cert_subject' 'CN=ipa-ca-agent,O=ECG.MIT.EDU' '-ldap_host' 'fileserver4.ecg.mit.edu' '-ldap_port' '7389' '-bind_dn' 'cn=Directory Manager' '-bind_password' XXXXXXXX '-base_dn' 'o=ipaca' '-db_name' 'ipaca' '-key_size' '2048' '-key_type' 'rsa' '-key_algorithm' 'SHA256withRSA' '-save_p12' 'true' '-backup_pwd' XXXXXXXX '-subsystem_name' 'pki-cad' '-token_name' 'internal' '-ca_subsystem_cert_subject_name' 'CN=CA Subsystem,O=ECG.MIT.EDU' '-ca_ocsp_cert_subject_name' 'CN=OCSP Subsystem,O=ECG.MIT.EDU' '-ca_server_cert_subject_name' 'CN=fileserver4.ecg.mit.edu,O=ECG.MIT.EDU' '-ca_audit_signing_cert_subject_name' 'CN=CA Audit,O=ECG.MIT.EDU' '-ca_sign_cert_subject_name' 'CN=Certificate Authority,O=ECG.MIT.EDU' '-external' 'false' '-clone' 'true' '-clone_p12_file' 'ca.p12' '-clone_p12_password' XXXXXXXX '-sd_hostname' 'fileserver3.ecg.mit.edu' '-sd_admin_port' '443' '-sd_admin_name' 'admin' '-sd_admin_password' XXXXXXXX '-clone_start_tls' 'true' '-clone_uri' 'https://fileserver3.ecg.mit.edu:443'' returned non-zero exit status 255

Sorry to dump a tonne of problems in one go, but you can see why I think there's something (probably several things) badly wrong with my installation. I guess I was looking for a few very basic things to check to ensure that the servers are fundamentally configured properly.

Unfortunately, it appears that some of your problems are unexpected and/or have not been seen before.

Hopefully I can fix them, as long as you don't mind my endless emails to the list.... :)
At some point, you may run into diminishing returns trying to fix your current broken installation - that is, the time spent playing whack-a-mole with these problems might be better spent starting over from scratch . . .
Thanks, Dan
_______________________________________________ Freeipa-users mailing list Freeipa-users@redhat.com https://www.redhat.com/mailman/listinfo/freeipa-users