Re: syncrepl slaves all quit after master restart - not a single retry

2010-08-12 Thread Nick Urbanik

Dear Alex,

On 28/07/10 18:57 -0400, Alexander Ivanov wrote:

Hello guys, I have a problem with delta-syncrepl replication (all set up
according to the 'official' guide
- http://www.openldap.org/doc/admin24/replication.html#Delta-syncrepl).
I have a master instance with logs 'shipped' to a client - it all works
fine as long as the connection is good.  Getting ready to move into
production, I'm trying to emulate connectivity problems, and here is where
I got problems.


[snip]


once I have the server disconnected (I simply restart slapd on the master), the
client does not even try to re-connect; the log below shows a modification
operation at 18:34:18 that went fine, and 11 seconds later I restart the
master's ldap service (which became immediately available again):


I am having the same trouble, but with ordinary syncrepl.  As soon as
the master is restarted, the slaves all quit their syncrepl threads, and
never start again:

Aug 12 08:58:00 ldapro04 slapd[9166]: do_syncrep2: rid 003 Can't contact LDAP server 
Aug 12 08:58:00 ldapro04 slapd[9166]: do_syncrepl: rid 003 quitting 


This is a serious barrier to deployment in a busy production
environment with many slaves.


Jul 28 18:34:29 newton slapd[20353]: do_syncrepl: rid 101 quitting

I'm running openldap 2.3.43-12.el5_5.1 from standard CentOS 5.4
installation.


I am running the same openldap as you, on CentOS 5.5.


Am I getting something wrong and the slave is not supposed to re-connect
after a master service restart, or is this some kind of problem that was
fixed in later versions?


I have exactly the same question.  I don't think Alex and I are the
only ones with this situation.

slapd.conf on provider:
===

# slapd.conf generated by /usr/bin/conform

include  /etc/openldap/schema/core.schema
include  /etc/openldap/schema/cosine.schema
include  /etc/openldap/schema/inetorgperson.schema
include  /etc/openldap/schema/nis.schema
include  /etc/openldap/schema/local.schema

loglevel stats sync
allow    bind_v2
pidfile  /var/run/openldap/slapd.pid
argsfile /var/run/openldap/slapd.args
tool-threads 4
modulepath   /usr/lib64/openldap


# GLOBAL database definition


access to dn.base=""
    by * read

access to dn.base=cn=Subschema
    by * read


# ou=tree,ou=name database definition


database bdb
suffix   ou=tree,ou=name
rootdn   cn=manager,ou=tree,ou=name
rootpw   root-password
directory    /var/lib/ldap/ou=tree,ou=name
index    entryCSN eq
index    entryUUID eq
index    objectClass eq
index    uid eq
index    username eq

cachesize    100
idlcachesize 100
checkpoint   65536 240
idletimeout  300
writetimeout 9
limits   dn.base=cn=syncrepl,ou=tree,ou=name
    size.soft=unlimited
    size.hard=unlimited
    time.soft=unlimited
    time.hard=unlimited

access to dn.subtree=ou=tree,ou=name
    by dn=cn=syncrepl,ou=tree,ou=name read
    by peername.ip=227.137.34.172 read
    by peername.ip=209.146.228.56 read
    by peername.ip=147.107.14.11 read
    by peername.ip=127.0.0.1 read
    by * none break

access to dn.subtree=ou=tree,ou=name attrs=userPassword
    by anonymous auth
    by * none break


overlay  syncprov
checkpoint   1000 5
sessionlog   10

slapd.conf on consumer:
===

# slapd.conf generated by /usr/bin/conform

include  /etc/openldap/schema/core.schema
include  /etc/openldap/schema/cosine.schema
include  /etc/openldap/schema/inetorgperson.schema
include  /etc/openldap/schema/nis.schema
include  /etc/openldap/schema/local.schema

loglevel stats sync
allow    bind_v2
pidfile  /var/run/openldap/slapd.pid
argsfile /var/run/openldap/slapd.args
tool-threads 8


# GLOBAL database definition


access to dn.base=""
    by * read

access to dn.base=cn=Subschema
    by * read


# ou=tree,ou=name database definition


database bdb
suffix   ou=tree,ou=name
rootdn   cn=manager,ou=tree,ou=name
rootpw   root-password
directory    /var/lib/ldap/ou=tree,ou=name
index    entryCSN eq
index    entryUUID eq
index    objectClass eq
index    uid eq
index    username eq

cachesize    100
idlcachesize 100
checkpoint   65536 240
idletimeout  300
writetimeout 9

access to dn.subtree=ou=tree,ou=name
    by peername.ip=49.66.187.43 read
    by peername.ip=139.243.36.117 read
    by peername.ip=115.165.210.17 read
    by peername.ip=25.79.141.72%255.255.255.0 read
    by peername.ip=127.0.0.1 read
    by * none break

access to 

Re: syncrepl slaves all quit after master restart - not a single retry

2010-08-12 Thread masarati

 syncrepl rid=003
  provider=ldap://master:389
  type=refreshAndPersist
  bindmethod=simple
  binddn=cn=syncrepl,ou=tree,ou=name
  credentials=syncrepl-password
  searchbase=ou=tree,ou=name

There is no retry here.  See slapd.conf(5) and the admin guide for
indications about how syncrepl should be configured.
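
For illustration only, here is a minimal sketch of the same consumer
stanza with a retry parameter added.  The interval and count values are
assumptions picked for the example, not something taken from your setup;
check slapd.conf(5) for the release you actually run before copying them.

```
# Hypothetical retry values, for illustration only:
# retry every 60 seconds for the first 10 attempts,
# then every 300 seconds indefinitely ("+" means no limit).
syncrepl rid=003
  provider=ldap://master:389
  type=refreshAndPersist
  retry="60 10 300 +"
  bindmethod=simple
  binddn=cn=syncrepl,ou=tree,ou=name
  credentials=syncrepl-password
  searchbase=ou=tree,ou=name
```

The comment lines sit above the stanza rather than between its
continuation lines, so they cannot be folded into the directive.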

p.



Re: syncrepl slaves all quit after master restart - not a single retry

2010-08-12 Thread Howard Chu

Nick Urbanik wrote:

Dear Alex,

On 28/07/10 18:57 -0400, Alexander Ivanov wrote:

Hello guys, I have a problem with delta-syncrepl replication (all set up
according to the 'official' guide
- http://www.openldap.org/doc/admin24/replication.html#Delta-syncrepl).

I'm running openldap 2.3.43-12.el5_5.1 from standard CentOS 5.4
installation.

I am running the same openldap as you, on CentOS 5.5.


It's generally a mistake to read the docs for a different version of the 
software than you're actually running.



I have a master instance with logs 'shipped' to a client - it all works
fine as long as the connection is good.  Getting ready to move into
production, I'm trying to emulate connectivity problems, and here is where
I got problems.


[snip]


once I have the server disconnected (I simply restart slapd on the master), the
client does not even try to re-connect; the log below shows a modification
operation at 18:34:18 that went fine, and 11 seconds later I restart the
master's ldap service (which became immediately available again):


I am having the same trouble, but with ordinary syncrepl.  As soon as
the master is restarted, the slaves all quit their syncrepl threads, and
never start again:



syncrepl rid=003
  provider=ldap://master:389
  type=refreshAndPersist
  bindmethod=simple
  binddn=cn=syncrepl,ou=tree,ou=name
  credentials=syncrepl-password
  searchbase=ou=tree,ou=name

If you see any problems with these configuration files, please let me
know, even if they do not relate to the problem of syncrepl
terminating after master is restarted.


You have no retry parameter in your syncrepl config, so naturally it does 
not retry. It always helps to actually Read The correct FM, slapd.conf(5) in 
your case.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: syncrepl slaves all quit after master restart - not a single retry

2010-08-12 Thread masarati

 You have no retry parameter in your syncrepl config, so naturally it does
 not retry. It always helps to actually Read The correct FM, slapd.conf(5)
 in your case.

I'd also note that slapd will issue

syncrepl rid=003 searchbase=ou=tree,ou=name: no retry defined, using default

if no retry is configured; one should at least wonder what that message
means.  I'd favor refusing to start if no retry is configured, since
replication is not reliable without.

p.



Re: syncrepl slaves all quit after master restart - not a single retry

2010-08-12 Thread Howard Chu

masar...@aero.polimi.it wrote:



You have no retry parameter in your syncrepl config, so naturally it does
not retry. It always helps to actually Read The correct FM, slapd.conf(5)
in your case.


I'd also note that slapd will issue

syncrepl rid=003 searchbase=ou=tree,ou=name: no retry defined, using default

if no retry is configured; one should at least wonder what that message
means.  I'd favor refusing to start if no retry is configured, since
replication is not reliable without.


That message was added in 2.4, these guys are using 2.3. At this point I've 
grown tired of telling people you're using an obsolete release, you should 
upgrade.


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/


Re: syncrepl slaves all quit after master restart - not a single retry

2010-08-12 Thread Nick Urbanik

Dear Masarati,

On 13/08/10 02:44 +0200, masar...@aero.polimi.it wrote:



You have no retry parameter in your syncrepl config, so naturally
it does not retry. It always helps to actually Read The correct FM,
slapd.conf(5) in your case.


Bless you, thank you very much for that help.


I'd also note that slapd will issue

syncrepl rid=003 searchbase=ou=tree,ou=name: no retry defined, using default

if no retry is configured; one should at least wonder what that
message means.  I'd favor refusing to start if no retry is
configured, since replication is not reliable without.


Yes, that makes sense.

[r...@ldapro04.syd ~]# grep -P '\bretry' /var/log/ldap*
[r...@ldapro04.syd ~]# 


No such error message seems to be present.
--
Nick Urbanik http://nicku.org 808-71011 nick.urba...@optusnet.com.au
GPG: 7FFA CDC7 5A77 0558 DC7A 790A 16DF EC5B BB9D 2C24  ID: BB9D2C24
I disclaim, therefore I am.


Re: syncrepl slaves all quit after master restart - not a single retry

2010-08-12 Thread Nick Urbanik

Dear Howard,

On 12/08/10 17:34 -0700, Howard Chu wrote:
You have no retry parameter in your syncrepl config, so naturally 
it does not retry. It always helps to actually Read The correct FM, 
slapd.conf(5) in your case.


Thank you very much indeed for your very helpful, prompt and accurate
reply!  I will happily buy you a beer or beverage of your choice if I
see you at Linuxconf or elsewhere.
--
Nick Urbanik http://nicku.org 808-71011 nick.urba...@optusnet.com.au
GPG: 7FFA CDC7 5A77 0558 DC7A 790A 16DF EC5B BB9D 2C24  ID: BB9D2C24
I disclaim, therefore I am.