Re: So I finally upgraded from slurpd...
> On Tue, 20 Jul 2010, masar...@aero.polimi.it wrote:
>
>>> It turned out that the object cn=admin,dc=foo,dc=no had multiple
>>> occurrences of "objectClass: organizationalRole" (!), and this also
>>> prevented syncrepl from working. I suspect it was a result of "manual"
>>> editing of ldif files followed by an import using slapadd. I get no
>>> warnings from slapadd when I import objects with multiple
>>> occurrences of the same objectClass.
>>>
>>> Perhaps slapadd/slapd should be able to deal with such duplicate
>>> entries better, to make it more obvious what's wrong? I'm just saying
>>> :)
>>
>> slapd(8) can handle those occurrences.
>
> But does it handle it well enough, when it prevents replsync from working?

This is a side effect: the replica receives bogus data via the protocol, and rejects it.

>> slapadd(8) is intended to load LDIF files generated by slapcat(8), thus
>> presumably consistent.
>
> And the file was indeed an LDIF file generated by slapcat.

I mean: from slapcat of a sane database.

> Since slapd allows
> it, slapcat will also spit it out - when slapcat, slapadd and slapd all
> "handle it" without giving any warnings back to anyone, it's not so easy
> to detect errors.

No, you miss one link: slapd did not handle it (I mean: through the protocol). When slapd starts up and opens a database, it does not validate its content, of course. And when it returns an entry, it does not validate its contents. The contents are validated only when a write is performed (usually, only the part that's being written, if it's a modify).

>> In general, it deals with the most obvious errors. I don't think asking
>> slapadd to perform these checks is a good idea, as it would slow it down
>> without real benefit: if an error is caught, you would need to restart,
>> wasting all the actual write effort.
>
> I don't quite agree - as I understand it slapadd already does some sanity
> checking, how much overhead would a check for objectClass doublets imply?

Why don't you code and test it yourself? Checking for duplicates requires normalizing the data and comparing each value with every other value. A wise implementation has quadratic cost (n*(n-1)/2 comparisons). You were bitten by a duplicate objectClass issue this time. If next time it happens to a group with 10,000 members, you'll be whining that your groups are perfectly sane, so why does it take so long to load your LDIF?

> And I don't see why you would need to restart, on a doublet either spit out
> a warning, or even better - spit out a warning and discard the doublet.

Those are implementation details; in many cases, the database needs to be complete - no holes; so if slapadd rejects an entry, it may not be able to add its children.

>> A sanity check tool for unreliable LDIF would probably be more
>> appropriate. I guess at this point most users would pretend their LDIF
>> is always reliable, and avoid running the sanity checker...
>
> Really? Yes, I would love a sanity checker, and I would most likely
> _always_ run LDIF through a sanity checker before using slapadd to write
> to back-end.
>
> But again - slapadd already does some sanity checking,

Usually, as much as is strictly required to properly perform its own task - regenerate a presumably sane database.

> and there's even a
> flag for "dry-run" mode (-u) which IMO says that it is supposed to be used
> as a sanity checking tool. I'm perfectly OK to let _all_ sanity checks
> only occur when using -u.

Embedding the sanity checker in slapadd is an option, indeed. Not the default, IMHO.

> I would love to dump all my ldap data to an LDIF and run it through a
> sanity checker, I suspect there's more "old noise" stuck in there.

Task separation is at the roots of clean programming - and system administration.

p.
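To put a number on the cost argument above, here is a minimal Python sketch of an all-pairs duplicate check. It is not slapadd's actual code. The normalize() helper is a stand-in assumption (trim, collapse whitespace, case-fold); real matching depends on the matching rule the schema defines for each attribute. The function names and sample values are hypothetical.

```python
from itertools import combinations


def normalize(value: str) -> str:
    """Stand-in normalization (assumption): collapse whitespace and case-fold."""
    return " ".join(value.split()).casefold()


def pairwise_comparisons(n: int) -> int:
    """Comparisons needed when each of n values is compared once with every other."""
    return n * (n - 1) // 2


def find_duplicates_pairwise(values):
    """Return (duplicates, comparisons) using the all-pairs check.

    Each duplicate is reported as (first_index, second_index, second_value).
    """
    normalized = [normalize(v) for v in values]
    duplicates = []
    comparisons = 0
    for (i, a), (j, b) in combinations(enumerate(normalized), 2):
        comparisons += 1
        if a == b:
            duplicates.append((i, j, values[j]))
    return duplicates, comparisons


if __name__ == "__main__":
    # Hypothetical objectClass values with one case-variant duplicate.
    object_classes = ["top", "organizationalRole", "simpleSecurityObject", "organizationalrole"]
    dups, count = find_duplicates_pairwise(object_classes)
    print(dups)                          # -> [(1, 3, 'organizationalrole')]
    print(count)                         # -> 6
    print(pairwise_comparisons(10_000))  # -> 49995000
```

For the 10,000-member group mentioned above, the all-pairs check comes to 49,995,000 comparisons for that one attribute. A set or a sort over the normalized values would need fewer comparisons, though that alternative is not discussed in this thread.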
Re: So I finally upgraded from slurpd...
On Tue, 20 Jul 2010, masar...@aero.polimi.it wrote:

> > It turned out that the object cn=admin,dc=foo,dc=no had multiple
> > occurrences of "objectClass: organizationalRole" (!), and this also
> > prevented syncrepl from working. I suspect it was a result of "manual"
> > editing of ldif files followed by an import using slapadd. I get no
> > warnings from slapadd when I import objects with multiple
> > occurrences of the same objectClass.
> >
> > Perhaps slapadd/slapd should be able to deal with such duplicate
> > entries better, to make it more obvious what's wrong? I'm just saying
> > :)
>
> slapd(8) can handle those occurrences.

But does it handle it well enough, when it prevents replsync from working?

> slapadd(8) is intended to load LDIF files generated by slapcat(8), thus
> presumably consistent.

And the file was indeed an LDIF file generated by slapcat. Since slapd allows it, slapcat will also spit it out. When slapcat, slapadd and slapd all "handle it" without giving any warnings back to anyone, it's not so easy to detect errors.

> In general, it deals with the most obvious errors. I don't think asking
> slapadd to perform these checks is a good idea, as it would slow it down
> without real benefit: if an error is caught, you would need to restart,
> wasting all the actual write effort.

I don't quite agree. As I understand it, slapadd already does some sanity checking, so how much overhead would a check for objectClass doublets imply?

And I don't see why you would need to restart: on a doublet, either spit out a warning or, even better, spit out a warning and discard the doublet.

> A sanity check tool for unreliable LDIF would probably be more
> appropriate. I guess at this point most users would pretend their LDIF
> is always reliable, and avoid running the sanity checker...

Really? Yes, I would love a sanity checker, and I would most likely _always_ run LDIF through a sanity checker before using slapadd to write to the back-end.

But again - slapadd already does some sanity checking, and there's even a flag for "dry-run" mode (-u), which IMO says that it is supposed to be used as a sanity checking tool. I'm perfectly OK with letting _all_ sanity checks occur only when using -u.

I would love to dump all my ldap data to an LDIF and run it through a sanity checker; I suspect there's more "old noise" stuck in there.

Cheers! :)

--
Kolbjørn Barmen
UNINETT Driftsenter
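The standalone sanity checker wished for above could start out as something like the following Python sketch. It is only an illustration under stated assumptions, not an existing OpenLDAP tool: it reads LDIF text, unfolds continuation lines, decodes base64 ("attr:: value") values, and reports values that repeat within one entry. Every value is compared case-insensitively, which is an assumption that suits objectClass but is wrong for case-sensitive attributes. The function names and the sample entries are hypothetical.

```python
import base64
from collections import defaultdict


def unfold(lines):
    """Join LDIF continuation lines (a leading space continues the previous line)."""
    out = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(" ") and out and out[-1]:
            out[-1] += line[1:]
        else:
            out.append(line)
    return out


def parse_entries(text):
    """Yield (dn, [(attr, value), ...]) for each blank-line-separated record."""
    record = []
    for line in unfold(text.splitlines()) + [""]:
        if line.startswith("#"):
            continue  # comment line
        if line == "":
            dn = next((v for a, v in record if a.lower() == "dn"), None)
            if dn is not None:
                yield dn, [(a, v) for a, v in record if a.lower() != "dn"]
            record = []
            continue
        attr, sep, rest = line.partition(":")
        if not sep:
            continue  # not an "attr: value" line; ignored in this sketch
        if rest.startswith(":"):
            value = base64.b64decode(rest[1:].strip()).decode("utf-8", "replace")
        else:
            value = rest.strip()
        record.append((attr, value))


def find_duplicate_values(text):
    """Return (dn, attribute, value) for each value repeated within an entry."""
    problems = []
    for dn, pairs in parse_entries(text):
        seen = defaultdict(set)
        for attr, value in pairs:
            key = " ".join(value.split()).casefold()  # assumed normalization
            if key in seen[attr.casefold()]:
                problems.append((dn, attr, value))
            seen[attr.casefold()].add(key)
    return problems


# Hypothetical sample modelled on the entry described in this thread.
SAMPLE = """\
dn: cn=admin,dc=foo,dc=no
objectClass: organizationalRole
objectClass: simpleSecurityObject
objectClass: organizationalRole
cn: admin

dn: dc=foo,dc=no
objectClass: dcObject
objectClass: organization
dc: foo
o: foo
"""

if __name__ == "__main__":
    for dn, attr, value in find_duplicate_values(SAMPLE):
        print(f"{dn}: duplicate {attr} value {value!r}")
```

Run against the output of slapcat before feeding it to slapadd, a check like this would have flagged the duplicated organizationalRole value. Unlike an all-pairs comparison, it tracks already-seen values in a set per attribute.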
Re: So I finally upgraded from slurpd...
> It turned out that the object cn=admin,dc=foo,dc=no had multiple occurrences
> of "objectClass: organizationalRole" (!), and this also prevented syncrepl
> from working. I suspect it was a result of "manual" editing of ldif files
> followed by an import using slapadd. I get no warnings from slapadd when I
> import objects with multiple occurrences of the same objectClass.
>
> Perhaps slapadd/slapd should be able to deal with such duplicate entries
> better, to make it more obvious what's wrong? I'm just saying :)

slapd(8) can handle those occurrences.

slapadd(8) is intended to load LDIF files generated by slapcat(8), thus presumably consistent. In general, it deals with the most obvious errors. I don't think asking slapadd to perform these checks is a good idea, as it would slow it down without real benefit: if an error is caught, you would need to restart, wasting all the actual write effort.

A sanity check tool for unreliable LDIF would probably be more appropriate. I guess at this point most users would pretend their LDIF is always reliable, and avoid running the sanity checker...

p.
Re: So I finally upgraded from slurpd...
On Tue, 20 Jul 2010, Quanah Gibson-Mount wrote:

> --On Tuesday, July 20, 2010 5:40 PM +0200 Kolbjørn Barmen wrote:
>
> > Perhaps slapadd/slapd should be able to deal with such duplicate entries
> > better, to make it more obvious what's wrong? I'm just saying :)
>
> What options did you use with slapadd? If you used -q, this is probably
> expected. If you did not, I suggest filing an ITS, although slapadd is
> never really as strict as ldapadd will be. It is meant for loading LDIF
> created by an export, which should already be sane.

I did not use -q.

ITS submitted: http://www.openldap.org/its/index.cgi/Incoming?id=6592

Thanks! :)

--
Kolbjørn Barmen
UNINETT Driftsenter
Re: So I finally upgraded from slurpd...
--On Tuesday, July 20, 2010 5:40 PM +0200 Kolbjørn Barmen wrote:

> Perhaps slapadd/slapd should be able to deal with such duplicate entries
> better, to make it more obvious what's wrong? I'm just saying :)

What options did you use with slapadd? If you used -q, this is probably expected. If you did not, I suggest filing an ITS, although slapadd is never really as strict as ldapadd will be. It is meant for loading LDIF created by an export, which should already be sane.

--Quanah

--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
Zimbra :: the leader in open source messaging and collaboration
Re: So I finally upgraded from slurpd...
On Fri, 9 Jul 2010, Quanah Gibson-Mount wrote:

> --On Thursday, July 08, 2010 7:04 PM +0200 Kolbjørn Barmen wrote:
>
> > I have at last upgraded a system from using slurpd (debian etch, slapd
> > 2.3.30) to using replsync, at least that was the intention.
>
> I believe you mean SyncRepl (Sync Replication).

Yes - of course :)

> What version of OpenLDAP is on the master? 2.3.30?

It is running 2.4.23.

> > syncrepl rid=123
> >         provider=ldaps://masterserver.uninett.no:636/
> >         type=refreshOnly
> >         interval=00:00:00:10
> >         retry="60 +"
> >         searchbase="dc=foo,dc=no"
> >         scope=sub
> >         schemachecking=off
> >         bindmethod=simple
> >         binddn="cn=admin,dc=foo,dc=no"
> >         credentials="f00bAr123"
>
> I highly advise using refreshAndPersist rather than refreshOnly.

Right! :)

> > When I start slapd on the slave I get on the slave:
> > ===
> > 18:37:50 server.foo.no slapd[7971]: @(#) $OpenLDAP: slapd 2.4.23 (Jul 5 2010 18:35:50) $
> > ^ir...@localhost:/home/kolla/openldap/openldap-2.4.23/debian/build/servers/slapd
> > 18:37:50 server.foo.no slapd[7972]: slapd starting
> > 18:37:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
> > 18:37:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
> > 18:38:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
> > 18:38:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
> > ===
>
> I would advise you start the slave with the "-d -1" options passed to the
> slapd binary, so you can see exactly what the master is sending to the
> replica. It sounds like it is sending invalid data. This could be a bug in
> the version that the master is running. You might want to try running a
> separate master for testing, that uses OpenLDAP 2.4.23 as well. There have
> been a multitude of fixes to the syncrepl code in OpenLDAP since the 2.3
> series.

Both slave and master are running 2.4.23.

After some debugging I found the culprit; it turned out the error message "(objectClass: value #0 provided more than once)" was a nice hint (although I don't quite see what "value #0" is supposed to tell me).

Just by coincidence I tried to change the object "cn=admin,dc=foo,dc=no" on the master with an ldap editor (gq), and got the same error message in return.

It turned out that the object cn=admin,dc=foo,dc=no had multiple occurrences of "objectClass: organizationalRole" (!), and this also prevented syncrepl from working. I suspect it was a result of "manual" editing of ldif files followed by an import using slapadd. I get no warnings from slapadd when I import objects with multiple occurrences of the same objectClass.

Perhaps slapadd/slapd should be able to deal with such duplicate entries better, to make it more obvious what's wrong? I'm just saying :)

Thanks!

--
Kolbjørn Barmen
UNINETT Driftsenter
Re: So I finally upgraded from slurpd...
--On Thursday, July 08, 2010 7:04 PM +0200 Kolbjørn Barmen wrote:

> I have at last upgraded a system from using slurpd (debian etch, slapd
> 2.3.30) to using replsync, at least that was the intention.

I believe you mean SyncRepl (Sync Replication).

What version of OpenLDAP is on the master? 2.3.30?

> ===
> And on slave:
> ===
> # updateref ldap://masterserver.uninett.no/

I'd still set updateref, so clients know where they should send updates.

> syncrepl rid=123
>         provider=ldaps://masterserver.uninett.no:636/
>         type=refreshOnly
>         interval=00:00:00:10
>         retry="60 +"
>         searchbase="dc=foo,dc=no"
>         scope=sub
>         schemachecking=off
>         bindmethod=simple
>         binddn="cn=admin,dc=foo,dc=no"
>         credentials="f00bAr123"

I highly advise using refreshAndPersist rather than refreshOnly.

> When I start slapd on the slave I get on the slave:
> ===
> 18:37:50 server.foo.no slapd[7971]: @(#) $OpenLDAP: slapd 2.4.23 (Jul 5 2010 18:35:50) $
> ^ir...@localhost:/home/kolla/openldap/openldap-2.4.23/debian/build/servers/slapd
> 18:37:50 server.foo.no slapd[7972]: slapd starting
> 18:37:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
> 18:37:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
> 18:38:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
> 18:38:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
> ===

I would advise you start the slave with the "-d -1" options passed to the slapd binary, so you can see exactly what the master is sending to the replica. It sounds like it is sending invalid data. This could be a bug in the version that the master is running. You might want to try running a separate master for testing, that uses OpenLDAP 2.4.23 as well. There have been a multitude of fixes to the syncrepl code in OpenLDAP since the 2.3 series.
--Quanah -- Quanah Gibson-Mount Principal Software Engineer Zimbra, Inc Zimbra :: the leader in open source messaging and collaboration
Re: Subject: So I finally upgraded from slurpd...
--On Friday, July 09, 2010 9:58 AM -0400 "Sotomayor, Vicente (ITD)" wrote:

> Since you have 30 consumers, I believe each one needs a different syncrepl
> rid and the error string of "value #0 provided more than once" might be
> related to that, so maybe you need to change them on each consumer?

This information is incorrect. RIDs are unique inside a consumer. Different consumers can use the same rid with no issue.

--Quanah

--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
Zimbra :: the leader in open source messaging and collaboration
Re: Subject: So I finally upgraded from slurpd...
On Fri, 9 Jul 2010, Sotomayor, Vicente (ITD) wrote:

> >When I start slapd on the slave I get on the slave:
> >===
> >18:37:50 server.foo.no slapd[7971]: @(#) $OpenLDAP: slapd 2.4.23 (Jul 5 2010 18:35:50) $
> >^ir...@localhost:/home/kolla/openldap/openldap-2.4.23/debian/build/servers/slapd
> >18:37:50 server.foo.no slapd[7972]: slapd starting
> >18:37:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
> >18:37:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
> >18:38:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
> >18:38:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
>
> Since you have 30 consumers, I believe each one needs a different
> syncrepl rid and the error string of "value #0 provided more than once"
> might be related to that, so maybe you need to change them on each
> consumer?

Sure, when I get that far. So far I have only experimented with one consumer, and set the rid on that one to 123, so I don't think it is related. I figured I'd better have it fully working on one consumer before I mess with the others.

> Good luck

Thanks, I need it :)

--
Kolbjørn Barmen
UNINETT Driftsenter
Re: Subject: So I finally upgraded from slurpd...
>When I start slapd on the slave I get on the slave:
>===
>18:37:50 server.foo.no slapd[7971]: @(#) $OpenLDAP: slapd 2.4.23 (Jul 5 2010 18:35:50) $
>^ir...@localhost:/home/kolla/openldap/openldap-2.4.23/debian/build/servers/slapd
>18:37:50 server.foo.no slapd[7972]: slapd starting
>18:37:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
>18:37:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
>18:38:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
>18:38:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying

Since you have 30 consumers, I believe each one needs a different syncrepl rid and the error string of "value #0 provided more than once" might be related to that, so maybe you need to change them on each consumer?

Good luck
So I finally upgraded from slurpd...
Hello

Let me start by introducing myself briefly:

* I'm a sysadm at the Norwegian NREN (UNINETT)
* As sysadm here, I end up doing a heck of a lot of things at once, and our LDAP servers are really just tiny (but important) pieces in the huge puzzle that is our network infrastructure. Point being, I don't have the capacity to "know it all", or RTFM (and comprehend) everything I'm involved with; things just tend to drop into my lap.
* I'm not particularly interested in or fond of LDAP as such :)
* I also have no particular interest in diving into LDAP and becoming some sort of LDAP wizard in the future

I just felt like mentioning the above before anyone throws me an RTFM ;)

Anyways...

I have at last upgraded a system from using slurpd (debian etch, slapd 2.3.30) to using replsync, at least that was the intention.

Let me start with the scenario:

* one master LDAP-server with a web frontend (old modified GOsa)
* 30ish slave LDAP-servers spread around on various institutions
* on the master LDAP-server, each institution has its own branch, like dc=foo,dc=no ; dc=bar,dc=no etc.
* each slave is supposed to replicate only its own branch, for example server.foo.no only has dc=foo,dc=no replicated from the master
* for each dc=foo,dc=no there is an admin user, e.g. cn=admin,dc=foo,dc=no, that has all rights granted to the corresponding subtree dc=foo,dc=no

That, in all its simplicity, is the scenario.

With slurpd I had entries like this in the master's slapd.conf:

replica host="server.foo.no"
        suffix="dc=foo,dc=no"
        binddn="cn=admin,dc=foo,dc=no"
        credentials="f00bAr123"
        bindmethod="simple"
        tls="critical"

and on the slaves (running 2.4.23, they were upgraded some time ago):

updatedn "cn=admin,dc=foo,dc=no"
updateref ldap://masterserver.uninett.no/

This worked fine, apart from occasionally getting out of sync every now and then.

Now, with the upgraded master, I have yet to get any replication working. Which sync method is most likely the best for my scenario?

On master I have added:
===
...
moduleload syncprov.la
moduleload back_ldap.la
...
# and under database, it looks like this:
database bdb
suffix "dc=no"
directory "/var/lib/ldap"
rootdn "cn=root,dc=no"
rootpw {SMD5}XXX=
index objectClass eq
index uid,gidNumber,uidNumber,memberUid pres,eq
index mail,gosaMailAlternateAddress pres,eq,sub
index gosaUser,gosaObject pres,eq,sub
index zoneName,relativeDomaiNname pres,eq
lastmod on
sizelimit 4000
overlay syncprov
syncprov-checkpoint 1000 60
syncprov-sessionlog 100
# and some access lists
access to dn.regex="dc=([^,]+),dc=no$" attrs=userPassword,sambaLMPassword,sambaNTPassword,goImapPassword
        by dn.regex="^cn=admin,dc=$1,dc=no$" write
        by anonymous auth
        by self write
        by * none
access to dn.base=""
        by * read
access to dn.regex="dc=([^,]+),dc=no$"
        by dn.regex="^cn=admin,dc=$1,dc=no$" write
        by * read
===

And on slave:
===
...
...
database bdb
suffix "dc=foo,dc=no"
directory "/var/lib/ldap"
rootdn "cn=admin,dc=foo,dc=no"
rootpw {SMD5}XXX=
index objectclass,entryCSN,entryUUID eq
index uid,gidNumber,uidNumber,memberUid pres,eq
index mail,gosaMailAlternateAddress pres,eq,sub
lastmod on
sizelimit 4000
# updatedn "cn=admin,dc=foo,dc=no"
# updateref ldap://masterserver.uninett.no/
syncrepl rid=123
        provider=ldaps://masterserver.uninett.no:636/
        type=refreshOnly
        interval=00:00:00:10
        retry="60 +"
        searchbase="dc=foo,dc=no"
        scope=sub
        schemachecking=off
        bindmethod=simple
        binddn="cn=admin,dc=foo,dc=no"
        credentials="f00bAr123"
access to attrs=userPassword,sambaLMPassword,sambaNTPassword,goImapPassword
        by anonymous auth
        by self write
        by * none
access to dn.base=""
        by * read
access to *
        by * read
===

I have (in good tradition) done a slapcat of the subtree dc=foo,dc=no on the master, copied the ldif over to the slave, and used slapadd there to create the entire database from scratch, so the content is identical.
When I start slapd on the slave I get on the slave: === 18:37:50 server.foo.no slapd[7971]: @(#) $OpenLDAP: slapd 2.4.23 (Jul 5 2010 18:35:50) $ ^ir...@localhost:/home/kolla/openldap/openldap-2.4.23/debian/build/servers/slapd 18:37:50 server.foo.no slapd[7972]: slapd starting 18:37:50 server.foo.no slapd[7972]: syncrepl_messag