Re: So I finally upgraded from slurpd...

2010-07-20 Thread masarati
> On Tue, 20 Jul 2010, masar...@aero.polimi.it wrote:
>
>>
>> > It turned out that the object cn=admin,dc=foo,dc=no had multiple
>> > occurrences of "objectClass: organizationalRole" (!), and this also
>> > prevented syncrepl from working. I suspect it was a result of "manual"
>> > editing of ldif files followed by an import using slapadd. I get no
>> > warnings from slapadd when I import objects with multiple
>> > occurrences of the same objectClass.
>> >
>> > Perhaps slapadd/slapd should be able to deal with such duplicate
>> > entries better, to make it more obvious what's wrong? I'm just saying
>> > :)
>>
>> slapd(8) can handle those occurrences.
>
> But does it handle it well enough, when it prevents syncrepl from working?

This is a side effect: the replica receives bogus data via the protocol,
and rejects it.

>> slapadd(8) is intended to load LDIF files generated by slapcat(8), thus
>> presumably consistent.
>
> And the file was indeed an LDIF file generated by slapcat.

I mean: from slapcat of a sane database.

> Since slapd allows
> it, slapcat will also spit it out - when slapcat, slapadd and slapd all
> "handle it" without giving any warnings back to anyone, it's not so easy
> to detect errors.

No, you are missing one link: slapd did not handle it (I mean: through the
protocol).  When slapd starts up and opens a database, it does not validate
its contents, of course.  And when it returns an entry, it does not validate
its contents either.  Contents are validated only when a write is performed
(and for a modify, usually only the part that is being written).

>> In general, it deals with the most obvious errors.  I don't think asking
>> slapadd to perform these checks is a good idea, as it would slow it down
>> without real benefit: if an error is caught, you would need to restart,
>> wasting all the actual write effort.
>
> I don't quite agree - as I understand it slapadd already does some sanity
> checking; how much overhead would a check for duplicate objectClass values
> imply?

Why don't you code and test it yourself?  Checking for duplicates requires
normalizing the data and comparing each value to every other one.  A naive
implementation has quadratic cost (n*(n-1)/2 comparisons).  You were
hit by a duplicate objectClass issue this time.  If next time it
happens to a group with 10,000 members, you'll be whining that your groups
are perfectly sane, so why does it take so long to load your LDIF?
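
To illustrate the cost argument, here is a minimal Python sketch (not slapd
code). The normalize() helper is a stand-in assumption: it lowercases and
collapses whitespace, whereas a real server applies the matching rule of each
attribute type. The pairwise version performs n*(n-1)/2 comparisons; the
hashed version is shown only as one possible alternative and still pays the
normalization cost for every value.

```python
from itertools import combinations


def normalize(value):
    # Assumption: case-insensitive match with collapsed whitespace.
    # Real attribute types each define their own matching rule.
    return " ".join(value.split()).lower()


def duplicates_pairwise(values):
    """Naive check: compare every pair of normalized values."""
    normalized = [normalize(v) for v in values]
    comparisons = 0
    duplicates = set()
    for a, b in combinations(normalized, 2):
        comparisons += 1
        if a == b:
            duplicates.add(a)
    return duplicates, comparisons


def duplicates_hashed(values):
    """Alternative: remember normalized values in a set (one pass)."""
    seen = set()
    duplicates = set()
    for v in values:
        n = normalize(v)
        if n in seen:
            duplicates.add(n)
        seen.add(n)
    return duplicates
```

For a group with 10,000 members the pairwise version would perform
10000*9999/2 = 49,995,000 comparisons.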

> And I don't see why you would need to restart: on a duplicate, either print
> a warning or, even better, print a warning and discard the duplicate.

Those are implementation details; in many cases, the database needs to be
complete - no holes; so if slapadd rejects an entry, it may not be able to
add its children.
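
A small Python sketch of the "no holes" problem, under stated assumptions:
unloadable_entries() and its helpers are hypothetical names, and the DN
handling is deliberately naive (it splits on commas and ignores escaped
commas). It simulates a load in LDIF order and reports the entries that
could not be added because their parent was rejected or is missing.

```python
def norm_dn(dn):
    # Assumption: naive normalization; escaped commas are not handled.
    return ",".join(part.strip().lower() for part in dn.split(","))


def parent_of(dn):
    # Everything after the first RDN is treated as the parent DN.
    return norm_dn(dn).partition(",")[2]


def unloadable_entries(dns, suffix, rejected=()):
    """Return DNs that cannot be added because an ancestor is absent."""
    loaded = set()
    skipped = {norm_dn(d) for d in rejected}
    suffix_n = norm_dn(suffix)
    orphans = []
    for dn in dns:
        n = norm_dn(dn)
        if n in skipped:
            continue  # simulate the loader discarding this entry
        if n == suffix_n or parent_of(dn) in loaded:
            loaded.add(n)
        else:
            orphans.append(dn)
    return orphans
```

Discarding ou=people,dc=foo,dc=no, for instance, leaves every entry below it
without a parent.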

>> A sanity check tool for unreliable LDIF would probably be more
>> appropriate.  I guess at this point most users would pretend their LDIF
>> is always reliable, and avoid running the sanity checker...
>
> Really? Yes, I would love a sanity checker, and I would most likely
> _always_ run LDIF through a sanity checker before using slapadd to write
> to back-end.
>
> But again - slapadd already does some sanity checking,

Usually, as much as it's strictly required to properly perform its own
task - regenerate a presumably sane database.

> and there's even a
> flag for "dry-run" mode (-u) which IMO says that it is supposed to be used
> as a sanity checking tool. I'm perfectly OK to let _all_ sanity checks
> only occur when using -u.

Embedding the sanity checker in slapadd is an option, indeed.  Not the
default, IMHO.

> I would love to dump all my ldap data to an LDIF and run it through a
> sanity checker, I suspect there's more "old noise" stuck in there.

Task separation is at the roots of clean programming - and system 
administration.

p.



Re: So I finally upgraded from slurpd...

2010-07-20 Thread Kolbjørn Barmen
On Tue, 20 Jul 2010, masar...@aero.polimi.it wrote:

> 
> > It turned out that the object cn=admin,dc=foo,dc=no had multiple
> > occurrences of "objectClass: organizationalRole" (!), and this also
> > prevented syncrepl from working. I suspect it was a result of "manual"
> > editing of ldif files followed by an import using slapadd. I get no
> > warnings from slapadd when I import objects with multiple
> > occurrences of the same objectClass.
> >
> > Perhaps slapadd/slapd should be able to deal with such duplicate
> > entries better, to make it more obvious what's wrong? I'm just saying
> > :)
> 
> slapd(8) can handle those occurrences.

But does it handle it well enough, when it prevents syncrepl from working?

> slapadd(8) is intended to load LDIF files generated by slapcat(8), thus
> presumably consistent.

And the file was indeed an LDIF file generated by slapcat. Since slapd allows
it, slapcat will also spit it out - when slapcat, slapadd and slapd all
"handle it" without giving any warnings back to anyone, it's not so easy
to detect errors.

> In general, it deals with the most obvious errors.  I don't think asking
> slapadd to perform these checks is a good idea, as it would slow it down
> without real benefit: if an error is caught, you would need to restart,
> wasting all the actual write effort.

I don't quite agree - as I understand it slapadd already does some sanity
checking; how much overhead would a check for duplicate objectClass values
imply? And I don't see why you would need to restart: on a duplicate, either
print a warning or, even better, print a warning and discard the duplicate.

> A sanity check tool for unreliable LDIF would probably be more
> appropriate.  I guess at this point most users would pretend their LDIF
> is always reliable, and avoid running the sanity checker...

Really? Yes, I would love a sanity checker, and I would most likely
_always_ run LDIF through a sanity checker before using slapadd to write
to back-end.

But again - slapadd already does some sanity checking, and there's even a
flag for "dry-run" mode (-u) which IMO says that it is supposed to be used
as a sanity checking tool. I'm perfectly OK to let _all_ sanity checks
only occur when using -u.

I would love to dump all my ldap data to an LDIF and run it through a
sanity checker, I suspect there's more "old noise" stuck in there.
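
A rough idea of what such a standalone checker could look like, as a Python
sketch. Everything here is an assumption for illustration: parse_ldif() and
duplicate_values() are hypothetical names, the parser only handles plain and
base64 ("attr:: ...") values plus folded lines, and values are compared
case-insensitively, which suits objectClass but not every attribute type.

```python
import base64


def parse_ldif(text):
    """Yield (dn, [(attr, value), ...]) for each entry in an LDIF string."""
    # Unfold continuation lines: a leading space continues the previous line.
    lines = []
    for raw in text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)

    entry = []
    for line in lines + [""]:  # trailing blank line flushes the last entry
        if line.startswith("#"):
            continue
        if not line.strip():
            if entry and entry[0][0] == "dn":
                yield entry[0][1], entry[1:]
            entry = []
            continue
        attr, _, rest = line.partition(":")
        if rest.startswith(":"):  # base64-encoded value
            value = base64.b64decode(rest[1:].strip()).decode("utf-8", "replace")
        else:
            value = rest.strip()
        entry.append((attr.strip().lower(), value))


def duplicate_values(text):
    """Return (dn, attr, value) for every repeated attribute value."""
    report = []
    for dn, attrs in parse_ldif(text):
        seen = set()
        for attr, value in attrs:
            key = (attr, value.lower())  # assumption: case-insensitive match
            if key in seen:
                report.append((dn, attr, value))
            seen.add(key)
    return report
```

Fed the output of slapcat, duplicate_values() would list each entry that
carries the same value twice, before anything is written to the back-end.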

Cheers! :)

-- 
Kolbjørn Barmen
UNINETT Driftsenter


Re: So I finally upgraded from slurpd...

2010-07-20 Thread masarati

> It turned out that the object cn=admin,dc=foo,dc=no had multiple
> occurrences
> of "objectClass: organizationalRole" (!), and this also prevented syncrepl
> from working. I suspect it was a result of "manual" editing of ldif files
> followed by an import using slapadd. I get no warnings from slapadd when I
> import objects with multiple occurrences of the same objectClass.
>
> Perhaps slapadd/slapd should be able to deal with such duplicate entries
> better, to make it more obvious what's wrong? I'm just saying :)

slapd(8) can handle those occurrences.  slapadd(8) is intended to load
LDIF files generated by slapcat(8), thus presumably consistent.  In
general, it deals with the most obvious errors.  I don't think asking
slapadd to perform these checks is a good idea, as it would slow it down
without real benefit: if an error is caught, you would need to restart,
wasting all the actual write effort.  A sanity check tool for unreliable
LDIF would probably be more appropriate.  I guess at this point most users
would pretend their LDIF is always reliable, and avoid running the sanity
checker...

p.



Re: So I finally upgraded from slurpd...

2010-07-20 Thread Kolbjørn Barmen
On Tue, 20 Jul 2010, Quanah Gibson-Mount wrote:

> --On Tuesday, July 20, 2010 5:40 PM +0200 Kolbjørn Barmen
>  wrote:
> 
> 
> > Perhaps slapadd/slapd should be able to deal with such duplicate entries
> > better, to make it more obvious what's wrong? I'm just saying :)
> 
> 
> What options did you use with slapadd?  If you used -q, this is probably
> expected.  If you did not, I suggest filing an ITS, although slapadd is
> never really as strict as ldapadd will be.  It is meant for loading LDIF
> created by an export, which should already be sane.

I did not use -q.

ITS submitted: http://www.openldap.org/its/index.cgi/Incoming?id=6592

Thanks! :)

-- 
Kolbjørn Barmen
UNINETT Driftsenter


Re: So I finally upgraded from slurpd...

2010-07-20 Thread Quanah Gibson-Mount
--On Tuesday, July 20, 2010 5:40 PM +0200 Kolbjørn Barmen 
 wrote:




> Perhaps slapadd/slapd should be able to deal with such duplicate entries
> better, to make it more obvious what's wrong? I'm just saying :)



What options did you use with slapadd?  If you used -q, this is probably 
expected.  If you did not, I suggest filing an ITS, although slapadd is 
never really as strict as ldapadd will be.  It is meant for loading LDIF 
created by an export, which should already be sane.


--Quanah

--

Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc

Zimbra ::  the leader in open source messaging and collaboration


Re: So I finally upgraded from slurpd...

2010-07-20 Thread Kolbjørn Barmen
On Fri, 9 Jul 2010, Quanah Gibson-Mount wrote:

> --On Thursday, July 08, 2010 7:04 PM +0200 Kolbjørn Barmen
>  wrote:
> 
> > I have at last upgraded a system from using slurpd (debian etch, slapd
> > 2.3.30) to using replsync, at least that was the intention.
> 
> I believe you mean SyncRepl (Sync Replication).

Yes - of course :)
  
> What version of OpenLDAP is on the master? 2.3.30?

It is running 2.4.23.

> > syncrepl rid=123
> > provider=ldaps://masterserver.uninett.no:636/
> > type=refreshOnly
> > interval=00:00:00:10
> > retry="60 +"
> > searchbase="dc=foo,dc=no"
> > scope=sub
> > schemachecking=off
> > bindmethod=simple
> > binddn="cn=admin,dc=foo,dc=no"
> > credentials="f00bAr123"
> 
> I highly advise using refreshAndPersist rather than refreshOnly.

Right! :)

> > When I start slapd on the slave I get on the slave:
> > ===
> > 18:37:50 server.foo.no slapd[7971]: @(#) $OpenLDAP: slapd 2.4.23 (Jul  5 2010 18:35:50) $
> > ^ir...@localhost:/home/kolla/openldap/openldap-2.4.23/debian/build/servers/slapd
> > 18:37:50 server.foo.no slapd[7972]: slapd starting
> > 18:37:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
> > 18:37:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
> > 18:38:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
> > 18:38:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
> > ===
> 
> 
> I would advise you start the slave with the "-d -1" options passed to the
> slapd binary, so you can see exactly what the master is sending to the
> replica.  It sounds like it is sending invalid data.  This could be a bug in
> the version that the master is running. You might want to try running a
> separate master for testing, that uses OpenLDAP 2.4.23 as well.  There have
> been a multitude of fixes to the syncrepl code in OpenLDAP since the 2.3
> series.

Both slave and master are running 2.4.23.

After some debugging I found the culprit; it turned out the error message
"(objectClass: value #0 provided more than once)" was a nice hint
(although I don't quite see what "value #0" is supposed to tell me).

By coincidence, I tried to change the object "cn=admin,dc=foo,dc=no"
on the master with an LDAP editor (gq), and got the same error message in
return.

It turned out that the object cn=admin,dc=foo,dc=no had multiple occurrences
of "objectClass: organizationalRole" (!), and this also prevented syncrepl
from working. I suspect it was a result of "manual" editing of ldif files
followed by an import using slapadd. I get no warnings from slapadd when I
import objects with multiple occurrences of the same objectClass.

Perhaps slapadd/slapd should be able to deal with such duplicate entries
better, to make it more obvious what's wrong? I'm just saying :)

Thanks!

-- 
Kolbjørn Barmen
UNINETT Driftsenter


Re: So I finally upgraded from slurpd...

2010-07-09 Thread Quanah Gibson-Mount
--On Thursday, July 08, 2010 7:04 PM +0200 Kolbjørn Barmen 
 wrote:



> I have at last upgraded a system from using slurpd (debian etch, slapd
> 2.3.30) to using replsync, at least that was the intention.


I believe you mean SyncRepl (Sync Replication).

What version of OpenLDAP is on the master? 2.3.30?


> ===
>
> And on slave:
> ===
>
> # updateref   ldap://masterserver.uninett.no/


I'd still set updateref, so clients know where they should send updates.


> syncrepl rid=123
> provider=ldaps://masterserver.uninett.no:636/
> type=refreshOnly
> interval=00:00:00:10
> retry="60 +"
> searchbase="dc=foo,dc=no"
> scope=sub
> schemachecking=off
> bindmethod=simple
> binddn="cn=admin,dc=foo,dc=no"
> credentials="f00bAr123"


I highly advise using refreshAndPersist rather than refreshOnly.


> When I start slapd on the slave I get on the slave:
> ===
> 18:37:50 server.foo.no slapd[7971]: @(#) $OpenLDAP: slapd 2.4.23 (Jul  5 2010 18:35:50) $
> ^ir...@localhost:/home/kolla/openldap/openldap-2.4.23/debian/build/servers/slapd
> 18:37:50 server.foo.no slapd[7972]: slapd starting
> 18:37:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
> 18:37:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
> 18:38:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods check (objectClass: value #0 provided more than once)
> 18:38:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
> ===



I would advise you start the slave with the "-d -1" options passed to the 
slapd binary, so you can see exactly what the master is sending to the 
replica.  It sounds like it is sending invalid data.  This could be a bug 
in the version that the master is running. You might want to try running a 
separate master for testing, that uses OpenLDAP 2.4.23 as well.  There have 
been a multitude of fixes to the syncrepl code in OpenLDAP since the 2.3 
series.



--Quanah

--

Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc



Re: Subject: So I finally upgraded from slurpd...

2010-07-09 Thread Quanah Gibson-Mount
--On Friday, July 09, 2010 9:58 AM -0400 "Sotomayor, Vicente (ITD)" 
 wrote:



> Since you have 30 consumers, I believe each one needs a different
> syncrepl rid and the error string of "value #0 provided more than once"
> might be related to that, so maybe you need to change them on each
> consumer?


This information is incorrect.  RIDs are unique inside a consumer. 
Different consumers can use the same rid with no issue.
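
As a quick illustration of that rule, a Python sketch that looks for a rid
used twice within one consumer's configuration. duplicate_rids() is a
hypothetical helper, and the regular expression assumes that "rid=" follows
"syncrepl" directly on the same line, as in the configuration posted in this
thread.

```python
import re
from collections import Counter

# Assumption: the directive is written as "syncrepl rid=NNN" on one line.
RID_PATTERN = re.compile(r"^\s*syncrepl\s+rid=(\d+)", re.MULTILINE)


def duplicate_rids(conf_text):
    """Return the rids that appear more than once in one slapd.conf text."""
    counts = Counter(int(rid) for rid in RID_PATTERN.findall(conf_text))
    return sorted(rid for rid, count in counts.items() if count > 1)
```

Each consumer's slapd.conf is checked on its own; two different consumers
both using rid=123 produce no finding.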


--Quanah


--

Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc



Re: Subject: So I finally upgraded from slurpd...

2010-07-09 Thread Kolbjørn Barmen
On Fri, 9 Jul 2010, Sotomayor, Vicente (ITD) wrote:

> >When I start slapd on the slave I get on the slave:
> >===
> >18:37:50 server.foo.no slapd[7971]: @(#) $OpenLDAP: slapd 2.4.23 (Jul  5 
> >2010 18:35:50) $ 
> >^ir...@localhost:/home/kolla/openldap/openldap-2.4.23/debian/build/servers/slapd
> >18:37:50 server.foo.no slapd[7972]: slapd starting
> >18:37:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods 
> >check (objectClass: value #0 provided more than once)
> >18:37:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
> >18:38:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods 
> >check (objectClass: value #0 provided more than once)
> >18:38:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
> 
> Since you have 30 consumers, I believe each one needs a different
> syncrepl rid and the error string of "value #0 provided more than once"
> might be related to that, so maybe you need to change them on each
> consumer? 

Sure, when I get that far - so far I have only experimented with one
consumer, and set the rid on that one to 123, so I don't think it is
related.

I figured I'd better have it fully working on one consumer before I mess
with the others.

> Good luck

Thanks, I need it :)

-- 
Kolbjørn Barmen
UNINETT Driftsenter


Re: Subject: So I finally upgraded from slurpd...

2010-07-09 Thread Sotomayor, Vicente (ITD)
>When I start slapd on the slave I get on the slave:
>===
>18:37:50 server.foo.no slapd[7971]: @(#) $OpenLDAP: slapd 2.4.23 (Jul  5 2010 
>18:35:50) $ 
>^ir...@localhost:/home/kolla/openldap/openldap-2.4.23/debian/build/servers/slapd
>18:37:50 server.foo.no slapd[7972]: slapd starting
>18:37:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods 
>check (objectClass: value #0 provided more than once)
>18:37:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying
>18:38:50 server.foo.no slapd[7972]: syncrepl_message_to_entry: rid=123 mods 
>check (objectClass: value #0 provided more than once)
>18:38:50 server.foo.no slapd[7972]: do_syncrepl: rid=123 rc 20 retrying

Since you have 30 consumers, I believe each one needs a different syncrepl rid 
and the error string of "value #0 provided more than once" might be related to 
that, so maybe you need to change them on each consumer? 

Good luck



So I finally upgraded from slurpd...

2010-07-09 Thread Kolbjørn Barmen

Hello

Let me start by introducing myself briefly:
* I'm a sysadm at the Norwegian NREN (UNINETT)
* As sysadm here, I end up doing a heck of a lot of things at once,
  and our LDAP servers are really just tiny (but important) pieces
  in the huge puzzle that is our network infrastructure. Point being,
  I don't have the capacity to "know it all", or RTFM (and comprehend)
  everything I'm involved with; things just tend to drop into my lap.
* I'm not particularly interested in or fond of LDAP as such :)
* I also have no particular interest in diving into LDAP and
  becoming some sort of LDAP wizard in the future

I just felt like mentioning the above before anyone throws me an RTFM ;)

Anyways...

I have at last upgraded a system from using slurpd (debian etch, slapd
2.3.30) to using replsync, at least that was the intention.

Let me start with the scenario:

* one master LDAP-server with a web frontend (old modified GOsa)

* 30ish slave LDAP-servers spread around on various institutions

* on the master LDAP-server, each institution has its own branch, like
  dc=foo,dc=no ; dc=bar,dc=no etc.

* each slave is supposed to replicate only its own branch, for example
  server.foo.no only has dc=foo,dc=no replicated from the master

* for each dc=foo,dc=no there is an admin user, e.g. cn=admin,dc=foo,dc=no
  that has all rights granted to the corresponding subtree dc=foo,dc=no

That, in all its simplicity, is the scenario.

With slurpd, I had entries like this in the master's slapd.conf:

replica host="server.foo.no"
        suffix="dc=foo,dc=no"
        binddn="cn=admin,dc=foo,dc=no"
        credentials="f00bAr123"
        bindmethod="simple"
        tls="critical"

and on the slaves, (running 2.4.23, they were upgraded some time ago):

updatedn    "cn=admin,dc=foo,dc=no"
updateref   ldap://masterserver.uninett.no/


This worked fine, apart from occasionally getting out of sync.

Now, with upgraded master - I have yet to get any replication working.

Which sync method is most likely the best for my scenario?


On master I have added:
===
...
moduleload  syncprov.la
moduleload  back_ldap.la
...

# and under database, it looks like this:

database    bdb
suffix  "dc=no"
directory   "/var/lib/ldap"
rootdn  "cn=root,dc=no"
rootpw  {SMD5}XXX=
index   objectClass eq
index   uid,gidNumber,uidNumber,memberUid   pres,eq
index   mail,gosaMailAlternateAddress   pres,eq,sub
index   gosaUser,gosaObject pres,eq,sub
index   zoneName,relativeDomaiNname pres,eq
lastmod on
sizelimit   4000
overlay syncprov
syncprov-checkpoint 1000 60
syncprov-sessionlog 100

# and some access lists

access to dn.regex="dc=([^,]+),dc=no$"
        attrs=userPassword,sambaLMPassword,sambaNTPassword,goImapPassword
        by dn.regex="^cn=admin,dc=$1,dc=no$" write
        by anonymous auth
        by self write
        by * none

access to dn.base="" by * read

access to dn.regex="dc=([^,]+),dc=no$"
        by dn.regex="^cn=admin,dc=$1,dc=no$" write
        by * read
===

And on slave:
===
...
...

database    bdb
suffix  "dc=foo,dc=no"
directory   "/var/lib/ldap"
rootdn  "cn=admin,dc=foo,dc=no"
rootpw  {SMD5}XXX=
index   objectclass,entryCSN,entryUUID eq
index   uid,gidNumber,uidNumber,memberUid pres,eq
index   mail,gosaMailAlternateAddress pres,eq,sub
lastmod on
sizelimit   4000
# updatedn"cn=admin,dc=foo,dc=no"
# updateref   ldap://masterserver.uninett.no/

syncrepl rid=123
        provider=ldaps://masterserver.uninett.no:636/
        type=refreshOnly
        interval=00:00:00:10
        retry="60 +"
        searchbase="dc=foo,dc=no"
        scope=sub
        schemachecking=off
        bindmethod=simple
        binddn="cn=admin,dc=foo,dc=no"
        credentials="f00bAr123"

access to attrs=userPassword,sambaLMPassword,sambaNTPassword,goImapPassword
        by anonymous auth
        by self write
        by * none

access to dn.base="" by * read

access to *
        by * read

===


I have (in good tradition) done a slapcat of the subtree dc=foo,dc=no on the 
master and copied over the ldif to the slave, and there used slapadd to create
the entire database from scratch, so the content is identical.

When I start slapd on the slave I get on the slave:
===
18:37:50 server.foo.no slapd[7971]: @(#) $OpenLDAP: slapd 2.4.23 (Jul  5 2010 
18:35:50) $ 
^ir...@localhost:/home/kolla/openldap/openldap-2.4.23/debian/build/servers/slapd
18:37:50 server.foo.no slapd[7972]: slapd starting
18:37:50 server.foo.no slapd[7972]: syncrepl_messag