> On Tue, 20 Jul 2010, masar...@aero.polimi.it wrote:
>
>> > It turned out that the object cn=admin,dc=foo,dc=no had multiple
>> > occurrences of "objectClass: organizationalRole" (!), and this also
>> > prevented syncrepl from working. I suspect it was a result of "manual"
>> > editing of ldif files followed by an import using slapadd. I get no
>> > warnings from slapadd when I import objects with multiple
>> > occurrences of the same objectClass.
>> >
>> > Perhaps slapadd/slapd should be able to deal with such duplicate
>> > entries better, to make it more obvious what's wrong? I'm just saying
>> > :)
>>
>> slapd(8) can handle those occurrences.
>
> But does it handle it well enough, when it prevents syncrepl from working?

This is a side effect: the replica receives bogus data via the protocol,
and rejects it.

>> slapadd(8) is intended to load LDIF files generated by slapcat(8), thus
>> presumably consistent.
>
> And the file was indeed an LDIF file generated by slapcat.

I mean: from slapcat of a sane database.

> Since slapd allows
> it, slapcat will also spit it out - when slapcat, slapadd and slapd all
> "handle it" without giving any warnings back to anyone, it's not so easy
> to detect errors.

No, you miss one link: slapd did not handle it (I mean: through the
protocol). When slapd starts up and opens a database, it does not
validate its content, of course. And when it returns an entry, it does
not validate its contents. The contents are validated only when a write
is performed (usually only the part being written, if it's a modify).

>> In general, it deals with the most obvious errors. I don't think asking
>> slapadd to perform these checks is a good idea, as it would slow it down
>> without real benefit: if an error is caught, you would need to restart,
>> wasting all the actual write effort.
>
> I don't quite agree - as I understand it slapadd already does some sanity
> checking, how much overhead would a check for objectClass doublets imply?

Why don't you code and test it yourself? Checking for duplicates
requires normalizing the data and comparing each value to every other
value. A naive implementation has quadratic cost (n*(n-1)/2 comparisons).
You were bitten by a duplicate objectClass issue this time. If next time
it happens to a group with 10,000 members, you'll be whining that your
groups are perfectly sane, so why does it take so long to load your LDIF?

> And I don't see why you would need to restart, on a doublet either spit out
> a warning, or even better - spit out a warning and discard the doublet.

Those are implementation details; in many cases, the database needs to
be complete - no holes; so if slapadd rejects an entry, it may not be
able to add its children.
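
To make the cost argument above concrete, here is a minimal sketch in
Python. It is not slapadd or slapd code. The normalize() helper is a
hypothetical stand-in for real matching-rule normalization, and both
function names are invented for illustration. The pairwise version
counts its comparisons, so the n*(n-1)/2 figure can be checked directly.

```python
from itertools import combinations


def normalize(value: str) -> str:
    # Hypothetical normalization: collapse whitespace and case-fold.
    # Real LDAP matching rules are attribute-specific and more involved.
    return " ".join(value.split()).casefold()


def has_duplicates_pairwise(values):
    """Compare every pair of normalized values.

    Returns (found, comparisons). With no duplicates present, the
    number of comparisons is n*(n-1)/2.
    """
    normalized = [normalize(v) for v in values]
    comparisons = 0
    for a, b in combinations(normalized, 2):
        comparisons += 1
        if a == b:
            return True, comparisons
    return False, comparisons


def has_duplicates_hashed(values):
    """Same check with a set of normalized values.

    Avoids the pairwise comparisons at the cost of extra memory, but
    still has to normalize every single value.
    """
    seen = set()
    for v in values:
        key = normalize(v)
        if key in seen:
            return True
        seen.add(key)
    return False


if __name__ == "__main__":
    members = [f"uid=user{i},ou=people,dc=foo,dc=no" for i in range(10_000)]
    found, comparisons = has_duplicates_pairwise(members)
    print(found, comparisons)
```

For a group with 10,000 distinct members the pairwise check performs
10,000 * 9,999 / 2 = 49,995,000 comparisons before concluding that
nothing is wrong. The set-based variant is cheaper, but either way the
work is paid on every entry loaded, sane or not.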

>> A sanity check tool for unreliable LDIF would probably be more
>> appropriate. I guess at this point most users would pretend their LDIF
>> is always reliable, and avoid running the sanity checker...
>
> Really? Yes, I would love a sanity checker, and I would most likely
> _always_ run LDIF through a sanity checker before using slapadd to write
> to back-end.
>
> But again - slapadd already does some sanity checking,

Usually, as much as is strictly required to properly perform its own
task - regenerate a presumably sane database.

> and there's even a
> flag for "dry-run" mode (-u) which IMO says that it is supposed to be used
> as a sanity checking tool. I'm perfectly OK to let _all_ sanity checks
> only occur when using -u.

Embedding the sanity checker in slapadd is an option, indeed. Not the
default, IMHO.

> I would love to dump all my ldap data to an LDIF and run it through a
> sanity checker, I suspect there's more "old noise" stuck in there.

Task separation is at the roots of clean programming - and system
administration.

p.
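
As an illustration of the separate sanity checker discussed above, here
is a rough sketch in Python of a standalone script that reports
duplicate attribute values in an LDIF dump. It is not an existing
OpenLDAP tool. The names parse_ldif and find_duplicate_values are
invented here, the parser handles only a simplified subset of LDIF
(folded lines, comments, base64 values; no "<" URL values or change
records), and it assumes that only objectClass values compare
case-insensitively, which is a deliberate simplification of real
matching rules. The SAMPLE entry is made up for the example.

```python
import base64
from collections import defaultdict


def parse_ldif(text):
    """Yield (dn, [(attr, value), ...]) for each entry of a simplified LDIF."""
    # Unfold continuation lines: a line starting with one space
    # continues the previous non-empty line.
    unfolded = []
    for line in text.splitlines():
        if line.startswith(" ") and unfolded and unfolded[-1]:
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)

    entry = []
    for line in unfolded + [""]:  # the extra "" flushes the last entry
        if line.startswith("#"):
            continue
        if not line.strip():
            # Blank line ends a block; skip blocks such as "version: 1".
            if entry and entry[0][0].lower() == "dn":
                yield entry[0][1], entry[1:]
            entry = []
            continue
        attr, _, rest = line.partition(":")
        if rest.startswith(":"):  # "attr:: <base64>"
            value = base64.b64decode(rest[1:].strip()).decode("utf-8", "replace")
        else:
            value = rest.strip()
        entry.append((attr, value))


def find_duplicate_values(text):
    """Return a list of (dn, attr, value) for every repeated value."""
    problems = []
    for dn, attrs in parse_ldif(text):
        seen = defaultdict(set)
        for attr, value in attrs:
            name = attr.lower()
            # Assumption: only objectClass is compared case-insensitively.
            key = value.lower() if name == "objectclass" else value
            if key in seen[name]:
                problems.append((dn, attr, value))
            seen[name].add(key)
    return problems


SAMPLE = """\
version: 1

dn: cn=admin,dc=foo,dc=no
objectClass: top
objectClass: organizationalRole
objectClass: organizationalRole
cn: admin
"""

if __name__ == "__main__":
    for dn, attr, value in find_duplicate_values(SAMPLE):
        print(f"{dn}: duplicate {attr} value {value!r}")
```

Keeping such a check outside slapadd means it can be run on suspect
LDIF before loading, without slowing down loads of data that came from
slapcat of a sane database.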