In case anybody is interested in why the corruption occurred: apparently a corrupt browsing (VLV) index caused it. This message in the errors log pointed me in that direction:

errors:[17/Jun/2010:12:51:18 +0200] - vlv_build_idl: can't follow db cursor (err -30989)

I deleted the browsing index from the affected ou and the problem was gone.
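For anyone who wants to check their own errors log for the same symptom, here is a minimal Python sketch. It only does plain string matching on the messages that appear in this thread (vlv_build_idl, libdb, FATAL ERROR); the function name and the marker list are my own choices for illustration, not anything shipped with the Directory Server.

```python
import re
from typing import Iterable, List, Tuple

# Markers taken from the log excerpts in this thread; extend as needed.
SUSPECT_MARKERS = ("vlv_build_idl", "libdb:", "FATAL ERROR")

# Matches a "[17/Jun/2010:12:51:18 +0200]" timestamp anywhere in the line,
# so a grep-style "errors:" filename prefix does not get in the way.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def find_suspect_lines(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, message) pairs for lines containing a suspect marker."""
    hits = []
    for line in lines:
        if not any(marker in line for marker in SUSPECT_MARKERS):
            continue
        match = TIMESTAMP.search(line)
        if match:
            timestamp = match.group(1)
            # Text after the timestamp, without the leading " - " separator.
            message = line[match.end():].strip(" -\n")
        else:
            timestamp = "unknown"
            message = line.strip()
        hits.append((timestamp, message))
    return hits


if __name__ == "__main__":
    sample = [
        "errors:[17/Jun/2010:12:51:18 +0200] - vlv_build_idl: "
        "can't follow db cursor (err -30989)",
        "[15/Jun/2010:11:46:14 +0200] - libdb: PANIC: Invalid argument",
        "[15/Jun/2010:11:30:03 +0200] - slapd started.",
    ]
    for timestamp, message in find_suspect_lines(sample):
        print(timestamp, message)
```

To run it against a real log, pass an open file instead of the sample list, for example the /var/log/dirsrv/slapd-serverb55/errors file mentioned further down in this thread.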
On Wed, Jun 16, 2010 at 9:39 PM, Rich Megginson <rmegg...@redhat.com> wrote:
> mark benschop wrote:
> > Hi Rich,
> > Thanks for your reply.
> > Please find the logging from the problems below.
> > The serverb55 is one of 2 servers in a multiple masters configuration
> > that consists of serverb55 and serverb05.
> >
> > The problem I initially had was that I had 2 entries that could not
> > be deleted on serverb55.
> >
> > Here's logging from the access file.
> > =======================================================================
> > access.20100614-092820:[15/Jun/2010:09:20:49 +0200] conn=342177 op=7 SRCH base="uid=dbeijk, ou=people, dc=directory,dc=intern" scope=0 filter="(objectClass=*)" attrs=ALL
> > access.20100614-092820:[15/Jun/2010:09:20:49 +0200] conn=342177 op=7 RESULT err=0 tag=101 nentries=1 etime=0
> > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=8 SRCH base="uid=dbeijk, ou=people, dc=directory,dc=intern" scope=1 filter="(objectClass=*)" attrs="objectClass"
> > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=8 RESULT err=0 tag=101 nentries=0 etime=0 notes=U
> > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=9 DEL dn="uid=dbeijk, ou=people, dc=directory,dc=intern"
> > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=9 RESULT err=1 tag=107 nentries=0 etime=0 csn=4c172a21000000370000
> > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=10 SRCH base="uid=dbeijk, ou=people, dc=directory,dc=intern" scope=1 filter="(objectClass=*)" attrs="objectClass"
> > access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=10 RESULT err=0 tag=101 nentries=0 etime=0 notes=U
> > =======================================================================
> >
> > LDAP error 1, I found, means 'operations error'. First I thought
> > something might be wrong with the entry itself.
> > The error log from serverb55, which I've added below, seemed to
> > point in that direction.
> >
> > When I logged on to the other ldapserver, serverb05, I tried to
> > delete the same entry to see if this slapd had the same issue, but
> > here it worked.
> > Replicating the delete didn't. The following error was logged to the
> > error log of this server:
> >
> > ========================================================================
> > [15/Jun/2010:09:35:17 +0200] NSMMReplicationPlugin - agmt="cn=serverb55" (serverb55:636): Consumer failed to replay change (uniqueid a276337c-5dc511df-852cfef8-667fa4d4, CSN 4c172d36000000050000): Operations error. Will retry later.
> > =======================================================================
> >
> > So there seemed to be a problem with serverb55 only.
> > Since I assumed the database had somehow become corrupt or
> > inconsistent, I tried the following steps to recreate the database
> > or have it checked in order to get it right again.
> > First there are the errors from the account that could not be deleted.
> > I 'reinitialised the consumer' from the working serverb05 to the
> > problematic serverb55.
> > Then I restarted the slapd.
> > Made an export of the database and imported that.
> > Slapd stopped the database.
> >
> > Please find the logging from /var/log/dirsrv/slapd-serverb55/errors
> > from the actions leading to the problem of the fatal server stop.
> > ======================================================================
> > CentOS-Directory/8.1.0 B2009.134.1334
> > serverb55:636 (/etc/dirsrv/slapd-serverb55)
> >
> > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" missing attribute "uidNumber" required by object class "posixAccount"
> > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" missing attribute "gidNumber" required by object class "posixAccount"
> > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" -- attribute "uidNumber" not allowed
> > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" missing attribute "uid" required by object class "posixAccount"
> > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" missing attribute "cn" required by object class "posixAccount"
> > [15/Jun/2010:09:22:58 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" missing attribute "homeDirectory" required by object class "posixAccount"
> > [15/Jun/2010:09:23:04 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" missing attribute "uidNumber" required by object class "posixAccount"
> > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" missing attribute "uidNumber" required by object class "posixAccount"
> > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" missing attribute "gidNumber" required by object class "posixAccount"
> > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" required attribute "objectclass" missing
> > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" required attribute "objectclass" missing
> > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" missing attribute "uid" required by object class "posixAccount"
> > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" missing attribute "cn" required by object class "posixAccount"
> > [15/Jun/2010:09:23:18 +0200] - Entry "uid=dbeijk, ou=People, dc=directory,dc=intern" missing attribute "homeDirectory" required by object class "posixAccount"
> > [15/Jun/2010:09:24:56 +0200] - Entry "uid=DEL *.*, ou=People, dc=directory,dc=intern" missing attribute "homeDirectory" required by object class "posixAccount"
> > [15/Jun/2010:09:50:20 +0200] - Entry "cn=wchiman, ou=people, dc=directory,dc=intern" -- attribute "uidNumber" not allowed
> > [15/Jun/2010:10:12:43 +0200] NSMMReplicationPlugin - multimaster_be_state_change: replica dc=directory,dc=intern is going offline; disabling replication
> > [15/Jun/2010:10:12:43 +0200] - attrcrypt_unwrap_key: failed to unwrap key for cipher AES
> > [15/Jun/2010:10:12:43 +0200] - Failed to retrieve key for cipher AES in attrcrypt_cipher_init
> > [15/Jun/2010:10:12:43 +0200] - Failed to initialize cipher AES in attrcrypt_init
> > [15/Jun/2010:10:12:43 +0200] - WARNING: Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database
> > [15/Jun/2010:10:12:48 +0200] - import userRoot: Workers finished; cleaning up...
> > [15/Jun/2010:10:12:49 +0200] - import userRoot: Workers cleaned up.
> > [15/Jun/2010:10:12:49 +0200] - import userRoot: Indexing complete. Post-processing...
> > [15/Jun/2010:10:12:49 +0200] - import userRoot: Flushing caches...
> > [15/Jun/2010:10:12:49 +0200] - import userRoot: Closing files...
> > [15/Jun/2010:10:12:49 +0200] - import userRoot: Import complete. Processed 4849 entries in 6 seconds. (808.17 entries/sec)
> > [15/Jun/2010:10:12:49 +0200] - attrcrypt_unwrap_key: failed to unwrap key for cipher AES
> > [15/Jun/2010:10:12:49 +0200] - Failed to retrieve key for cipher AES in attrcrypt_cipher_init
> > [15/Jun/2010:10:12:49 +0200] - Failed to initialize cipher AES in attrcrypt_init
> > [15/Jun/2010:10:12:49 +0200] NSMMReplicationPlugin - multimaster_be_state_change: replica dc=directory,dc=intern is coming online; enabling replication
> > [15/Jun/2010:10:12:49 +0200] NSMMReplicationPlugin - replica_reload_ruv: Warning: new data for replica dc=directory,dc=intern does not match the data in the changelog. Recreating the changelog file. This could affect replication with replica's consumers in which case the consumers should be reinitialized.
> > [15/Jun/2010:10:12:49 +0200] - skipping cos definition cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found
> > [15/Jun/2010:10:55:43 +0200] NSMMReplicationPlugin - multimaster_be_state_change: replica dc=directory,dc=intern is going offline; disabling replication
> > [15/Jun/2010:10:55:43 +0200] - attrcrypt_unwrap_key: failed to unwrap key for cipher AES
> > [15/Jun/2010:10:55:43 +0200] - Failed to retrieve key for cipher AES in attrcrypt_cipher_init
> > [15/Jun/2010:10:55:43 +0200] - Failed to initialize cipher AES in attrcrypt_init
> > [15/Jun/2010:10:55:43 +0200] - WARNING: Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database
> > [15/Jun/2010:10:55:49 +0200] - import userRoot: Workers finished; cleaning up...
> > [15/Jun/2010:10:55:49 +0200] - import userRoot: Workers cleaned up.
> > [15/Jun/2010:10:55:49 +0200] - import userRoot: Indexing complete. Post-processing...
> > [15/Jun/2010:10:55:49 +0200] - import userRoot: Flushing caches...
> > [15/Jun/2010:10:55:49 +0200] - import userRoot: Closing files...
> > [15/Jun/2010:10:55:49 +0200] - import userRoot: Import complete. Processed 4850 entries in 5 seconds. (970.00 entries/sec)
> > [15/Jun/2010:10:55:49 +0200] - attrcrypt_unwrap_key: failed to unwrap key for cipher AES
> > [15/Jun/2010:10:55:49 +0200] - Failed to retrieve key for cipher AES in attrcrypt_cipher_init
> > [15/Jun/2010:10:55:49 +0200] - Failed to initialize cipher AES in attrcrypt_init
> > [15/Jun/2010:10:55:49 +0200] NSMMReplicationPlugin - multimaster_be_state_change: replica dc=directory,dc=intern is coming online; enabling replication
> > [15/Jun/2010:10:55:49 +0200] NSMMReplicationPlugin - replica_reload_ruv: Warning: new data for replica dc=directory,dc=intern does not match the data in the changelog. Recreating the changelog file. This could affect replication with replica's consumers in which case the consumers should be reinitialized.
> > [15/Jun/2010:10:55:49 +0200] - skipping cos definition cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found
> > [15/Jun/2010:10:59:57 +0200] - slapd shutting down - signaling operation threads
> > [15/Jun/2010:10:59:57 +0200] - slapd shutting down - waiting for 26 threads to terminate
> > [15/Jun/2010:10:59:57 +0200] - slapd shutting down - closing down internal subsystems and plugins
> > [15/Jun/2010:10:59:58 +0200] - Waiting for 4 database threads to stop
> > [15/Jun/2010:10:59:59 +0200] - All database threads now stopped
> > [15/Jun/2010:10:59:59 +0200] - slapd stopped.
> > CentOS-Directory/8.1.0 B2009.134.1334
> > <host>:<port> (/etc/dirsrv/slapd-serverb55)
> >
> > [15/Jun/2010:11:00:01 +0200] - Entry "cn=schema" single-valued attribute "modifyTimestamp" has multiple values
> > CentOS-Directory/8.1.0 B2009.134.1334
> > serverb55:636 (/etc/dirsrv/slapd-serverb55)
> >
> > [15/Jun/2010:11:00:01 +0200] - CentOS-Directory/8.1.0 B2009.134.1334 starting up
> > [15/Jun/2010:11:00:01 +0200] - I'm resizing my cache now...cache was 20000000 and is now 8000000
> > [15/Jun/2010:11:00:01 +0200] - attrcrypt_unwrap_key: failed to unwrap key for cipher AES
> > [15/Jun/2010:11:00:01 +0200] - Failed to retrieve key for cipher AES in attrcrypt_cipher_init
> > [15/Jun/2010:11:00:01 +0200] - Failed to initialize cipher AES in attrcrypt_init
> > [15/Jun/2010:11:00:01 +0200] - attrcrypt_unwrap_key: failed to unwrap key for cipher AES
> > [15/Jun/2010:11:00:01 +0200] - Failed to retrieve key for cipher AES in attrcrypt_cipher_init
> > [15/Jun/2010:11:00:01 +0200] - Failed to initialize cipher AES in attrcrypt_init
> > [15/Jun/2010:11:00:01 +0200] - skipping cos definition cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found
> > [15/Jun/2010:11:00:01 +0200] NSMMReplicationPlugin - replica_check_for_data_reload: Warning: data for replica dc=directory,dc=intern was reloaded and it no longer matches the data in the changelog (replica data > changelog). Recreating the changelog file. This could affect replication with replica's consumers in which case the consumers should be reinitialized.
> > [15/Jun/2010:11:00:01 +0200] - skipping cos definition cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found
> > [15/Jun/2010:11:00:01 +0200] - slapd started. Listening on All Interfaces port 389 for LDAP requests
> > [15/Jun/2010:11:00:01 +0200] - Listening on All Interfaces port 636 for LDAPS requests
> > [15/Jun/2010:11:29:59 +0200] - slapd shutting down - signaling operation threads
> > [15/Jun/2010:11:29:59 +0200] - slapd shutting down - closing down internal subsystems and plugins
> > [15/Jun/2010:11:30:00 +0200] - Waiting for 4 database threads to stop
> > [15/Jun/2010:11:30:00 +0200] - All database threads now stopped
> > [15/Jun/2010:11:30:00 +0200] - slapd stopped.
> > CentOS-Directory/8.1.0 B2009.134.1334
> > <host>:<port> (/etc/dirsrv/slapd-serverb55)
> >
> > [15/Jun/2010:11:30:03 +0200] - Entry "cn=schema" single-valued attribute "modifyTimestamp" has multiple values
> > CentOS-Directory/8.1.0 B2009.134.1334
> > serverb55:636 (/etc/dirsrv/slapd-serverb55)
> >
> > [15/Jun/2010:11:30:03 +0200] - CentOS-Directory/8.1.0 B2009.134.1334 starting up
> > [15/Jun/2010:11:30:03 +0200] - attrcrypt_unwrap_key: failed to unwrap key for cipher AES
> > [15/Jun/2010:11:30:03 +0200] - Failed to retrieve key for cipher AES in attrcrypt_cipher_init
> > [15/Jun/2010:11:30:03 +0200] - Failed to initialize cipher AES in attrcrypt_init
> > [15/Jun/2010:11:30:03 +0200] - attrcrypt_unwrap_key: failed to unwrap key for cipher AES
> > [15/Jun/2010:11:30:03 +0200] - Failed to retrieve key for cipher AES in attrcrypt_cipher_init
> > [15/Jun/2010:11:30:03 +0200] - Failed to initialize cipher AES in attrcrypt_init
> > [15/Jun/2010:11:30:03 +0200] - skipping cos definition cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found
> > [15/Jun/2010:11:30:03 +0200] - skipping cos definition cn=nsAccountInactivation_cos,dc=directory,dc=intern--no templates found
> > [15/Jun/2010:11:30:03 +0200] - slapd started. Listening on All Interfaces port 389 for LDAP requests
> > [15/Jun/2010:11:30:03 +0200] - Listening on All Interfaces port 636 for LDAPS requests
> > [15/Jun/2010:11:40:44 +0200] - Beginning export of 'userroot'
> > [15/Jun/2010:11:40:44 +0200] - export userRoot: Processed 139 entries (100%).
> > [15/Jun/2010:11:40:44 +0200] - Export finished.
> > [15/Jun/2010:11:46:12 +0200] NSMMReplicationPlugin - multimaster_be_state_change: replica dc=directory,dc=intern is going offline; disabling replication
> > [15/Jun/2010:11:46:12 +0200] - attrcrypt_unwrap_key: failed to unwrap key for cipher AES
> > [15/Jun/2010:11:46:12 +0200] - Failed to retrieve key for cipher AES in attrcrypt_cipher_init
> > [15/Jun/2010:11:46:12 +0200] - Failed to initialize cipher AES in attrcrypt_init
> > [15/Jun/2010:11:46:12 +0200] - WARNING: Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database
> > [15/Jun/2010:11:46:14 +0200] - libdb: page 1: illegal page type or format
> > [15/Jun/2010:11:46:14 +0200] - libdb: PANIC: Invalid argument
> > [15/Jun/2010:11:46:14 +0200] - FATAL ERROR at by MCC ou=people dc=directory dc=intern (77); server stopping as database recovery needed.
> > CentOS-Directory/8.1.0 B2009.134.1334
> > <host>:<port> (/etc/dirsrv/slapd-serverb55
> > ======================================================================
> >
> > Finally my questions:
> > What could be the cause of the problem?
> Not sure - never seen this particular error before - any disk errors in
> /var/log/messages?
>
> Can you try doing a reinit of this server from the other server?
> > What would be the best procedure to get the serverb55 up and running
> > again?
> Doing a reinit of this server from the other server.
> >
> > Thanks for any advice.
> >
> > Regards,
> > Mark
> >
> > =======
> >
> > On Tue, Jun 15, 2010 at 7:04 PM, Rich Megginson <rmegg...@redhat.com> wrote:
> >
> > mark benschop wrote:
> > > Hi All,
> > >
> > > I'm having a problem on a CentOS Directory Server 8.1 multiple master
> > > setup.
> > > The database of one of the servers has been marked as corrupt and has
> > > been brought offline by the Directory Server.
> > Can you post any relevant error messages from the error log of the
> > server?
> > > LDAP clients querying the ldapserver, e.g. for logging in users, get
> > > an error message, effectively preventing users from logging in.
> > What error message?
> > >
> > > I'm wondering what the best method is to recover from this situation.
> > > I can think of a few:
> > > 1) Starting the ldapserver, deleting the database, recreating it and
> > > restoring a backup.
> > >
> > > 2) Starting the ldapserver, deleting the database and reinitialising
> > > the server from the other master.
> > If you reinitialize the problem server from another server, you don't
> > need to delete the database, reinit will do that for you.
> > >
> > > Can anyone give me some hints if this will work or would another
> > > approach be better?
> > >
> > > Thanks for your advice,
> > > Mark
> > >
> > > --
> > > 389 users mailing list
> > > 389-us...@lists.fedoraproject.org
> > > https://admin.fedoraproject.org/mailman/listinfo/389-users
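A footnote on the access log quoted in this thread: the failed DEL is the RESULT line with err=1, which in the standard LDAP result codes is operationsError (unwillingToPerform is 53). Below is a small Python sketch, assuming the conn=/op= line layout shown in that excerpt, that pairs each non-zero RESULT with the request it belongs to. The function name and the short code table are hypothetical helpers of mine and cover only a few codes.

```python
import re
from typing import Dict, Iterable, List, Tuple

# A few standard LDAP result codes (RFC 4511); deliberately not exhaustive.
RESULT_NAMES = {
    0: "success",
    1: "operationsError",
    32: "noSuchObject",
    53: "unwillingToPerform",
}

# "conn=342177 op=9 DEL dn=..." or "conn=342177 op=9 RESULT err=1 ..."
OP_LINE = re.compile(r"conn=(\d+) op=(\d+) (\w+)(.*)")
ERR = re.compile(r"\berr=(\d+)")


def failed_operations(lines: Iterable[str]) -> List[Tuple[str, int, str]]:
    """Return (request, err code, code name) for every RESULT with err != 0."""
    requests: Dict[Tuple[str, str], str] = {}
    failures = []
    for line in lines:
        match = OP_LINE.search(line)
        if not match:
            continue
        key = (match.group(1), match.group(2))
        kind, rest = match.group(3), match.group(4).strip()
        if kind != "RESULT":
            # Remember the request so its RESULT can be matched up later.
            requests[key] = f"{kind} {rest}".strip()
            continue
        err = ERR.search(rest)
        if err and int(err.group(1)) != 0:
            code = int(err.group(1))
            name = RESULT_NAMES.get(code, "other")
            failures.append((requests.get(key, "unknown request"), code, name))
    return failures


if __name__ == "__main__":
    sample = [
        'access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=9 '
        'DEL dn="uid=dbeijk, ou=people, dc=directory,dc=intern"',
        'access.20100614-092820:[15/Jun/2010:09:22:08 +0200] conn=342177 op=9 '
        'RESULT err=1 tag=107 nentries=0 etime=0 csn=4c172a21000000370000',
    ]
    for request, code, name in failed_operations(sample):
        print(f"err={code} ({name}): {request}")
```

The sample lines are the two op=9 entries from the excerpt with the mail wrapping removed; on a real access log the entries are already one per line.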