Re: syncrepl slaves all quit after master restart - not a single retry
Dear Alex,

On 28/07/10 18:57 -0400, Alexander Ivanov wrote:
> Hello guys,
>
> I have a problem with delta-syncrepl replication (all set up according to the 'official' guide - http://www.openldap.org/doc/admin24/replication.html#Delta-syncrepl).
>
> I have a master instance with logs 'shipped' to a client - it all works fine as long as the connection is good. Getting ready to move into production, I'm trying to emulate connectivity problems, and here is where I got problems.

[snip]

> Once I have the server disconnected (I simply restart slapd on the master), the client does not even try to re-connect. The log below shows a modification operation at 18:34:18 that went fine, and 11 seconds later I restart the master's ldap service (which became immediately available again):

I am having the same trouble, but with ordinary syncrepl. As soon as the master is restarted, the slaves all quit their syncrepl threads, and never start again:

Aug 12 08:58:00 ldapro04 slapd[9166]: do_syncrep2: rid 003 Can't contact LDAP server
Aug 12 08:58:00 ldapro04 slapd[9166]: do_syncrepl: rid 003 quitting

This is a serious barrier to deployment in a busy production environment with many slaves.

> Jul 28 18:34:29 newton slapd[20353]: do_syncrepl: rid 101 quitting
>
> I'm running openldap 2.3.43-12.el5_5.1 from a standard CentOS 5.4 installation.

I am running the same openldap as you, on CentOS 5.5.

> Do I get something wrong and the slave is not supposed to re-connect after a master service restart, or is this some kind of problem that was fixed in later versions?

I have exactly the same question. I don't think Alex and I are the only ones with this situation.

slapd.conf on provider:
===
# slapd.conf generated by /usr/bin/conform
include /etc/openldap/schema/core.schema
include /etc/openldap/schema/cosine.schema
include /etc/openldap/schema/inetorgperson.schema
include /etc/openldap/schema/nis.schema
include /etc/openldap/schema/local.schema
loglevel stats sync
allow bind_v2
pidfile /var/run/openldap/slapd.pid
argsfile /var/run/openldap/slapd.args
tool-threads 4
modulepath /usr/lib64/openldap

# GLOBAL database definition
access to dn.base=""
    by * read
access to dn.base=cn=Subschema
    by * read

# ou=tree,ou=name database definition
database bdb
suffix ou=tree,ou=name
rootdn cn=manager,ou=tree,ou=name
rootpw root-password
directory /var/lib/ldap/ou=tree,ou=name
index entryCSN eq
index entryUUID eq
index objectClass eq
index uid eq
index username eq
cachesize 100
idlcachesize 100
checkpoint 65536 240
idletimeout 300
writetimeout 9
limits dn.base=cn=syncrepl,ou=tree,ou=name
    size.soft=unlimited size.hard=unlimited
    time.soft=unlimited time.hard=unlimited
access to dn.subtree=ou=tree,ou=name
    by dn=cn=syncrepl,ou=tree,ou=name read
    by peername.ip=227.137.34.172 read
    by peername.ip=209.146.228.56 read
    by peername.ip=147.107.14.11 read
    by peername.ip=127.0.0.1 read
    by * none break
access to dn.subtree=ou=tree,ou=name attrs=userPassword
    by anonymous auth
    by * none break
overlay syncprov
checkpoint 1000 5
sessionlog 10

slapd.conf on consumer:
===
# slapd.conf generated by /usr/bin/conform
include /etc/openldap/schema/core.schema
include /etc/openldap/schema/cosine.schema
include /etc/openldap/schema/inetorgperson.schema
include /etc/openldap/schema/nis.schema
include /etc/openldap/schema/local.schema
loglevel stats sync
allow bind_v2
pidfile /var/run/openldap/slapd.pid
argsfile /var/run/openldap/slapd.args
tool-threads 8

# GLOBAL database definition
access to dn.base=""
    by * read
access to dn.base=cn=Subschema
    by * read

# ou=tree,ou=name database definition
database bdb
suffix ou=tree,ou=name
rootdn cn=manager,ou=tree,ou=name
rootpw root-password
directory /var/lib/ldap/ou=tree,ou=name
index entryCSN eq
index entryUUID eq
index objectClass eq
index uid eq
index username eq
cachesize 100
idlcachesize 100
checkpoint 65536 240
idletimeout 300
writetimeout 9
access to dn.subtree=ou=tree,ou=name
    by peername.ip=49.66.187.43 read
    by peername.ip=139.243.36.117 read
    by peername.ip=115.165.210.17 read
    by peername.ip=25.79.141.72%255.255.255.0 read
    by peername.ip=127.0.0.1 read
    by * none break
access to
Re: syncrepl slaves all quit after master restart - not a single retry
> syncrepl rid=003
>     provider=ldap://master:389
>     type=refreshAndPersist
>     bindmethod=simple
>     binddn=cn=syncrepl,ou=tree,ou=name
>     credentials=syncrepl-password
>     searchbase=ou=tree,ou=name

There is no retry here. See slapd.conf(5) and the admin guide for indications about how syncrepl should be configured.

p.
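
For reference, a minimal sketch of the consumer stanza quoted above with a retry parameter added. The interval and count values are illustrative assumptions, not taken from this thread: retry every 60 seconds for the first 10 attempts, then every 300 seconds indefinitely (the "+"). Check slapd.conf(5) for the release you are actually running before relying on it.

```
# Hypothetical values: 60 s interval x 10 tries, then 300 s interval forever ("+").
# Comments are kept outside the directive so they do not swallow continuation lines.
syncrepl rid=003
    provider=ldap://master:389
    type=refreshAndPersist
    retry="60 10 300 +"
    bindmethod=simple
    binddn="cn=syncrepl,ou=tree,ou=name"
    credentials=syncrepl-password
    searchbase="ou=tree,ou=name"
```

With a finite final count instead of "+", the consumer stops retrying once that count is used up, which would leave it in the same state as having no retry at all after a long enough outage.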
Re: syncrepl slaves all quit after master restart - not a single retry
Nick Urbanik wrote:
> Dear Alex,
>
> On 28/07/10 18:57 -0400, Alexander Ivanov wrote:
>> Hello guys,
>>
>> I have a problem with delta-syncrepl replication (all set up according to the 'official' guide - http://www.openldap.org/doc/admin24/replication.html#Delta-syncrepl).
>>
>> I'm running openldap 2.3.43-12.el5_5.1 from a standard CentOS 5.4 installation.
>
> I am running the same openldap as you, on CentOS 5.5.

It's generally a mistake to read the docs for a different version of the software than you're actually running.

>> I have a master instance with logs 'shipped' to a client - it all works fine as long as the connection is good. Getting ready to move into production, I'm trying to emulate connectivity problems, and here is where I got problems.
>
> [snip]
>
>> Once I have the server disconnected (I simply restart slapd on the master), the client does not even try to re-connect. The log below shows a modification operation at 18:34:18 that went fine, and 11 seconds later I restart the master's ldap service (which became immediately available again):
>
> I am having the same trouble, but with ordinary syncrepl. As soon as the master is restarted, the slaves all quit their syncrepl threads, and never start again:
>
> syncrepl rid=003
>     provider=ldap://master:389
>     type=refreshAndPersist
>     bindmethod=simple
>     binddn=cn=syncrepl,ou=tree,ou=name
>     credentials=syncrepl-password
>     searchbase=ou=tree,ou=name
>
> If you see any problems with these configuration files, please let me know, even if they do not relate to the problem of syncrepl terminating after the master is restarted.

You have no retry parameter in your syncrepl config, so naturally it does not retry. It always helps to actually Read The correct FM, slapd.conf(5) in your case.

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
Re: syncrepl slaves all quit after master restart - not a single retry
> You have no retry parameter in your syncrepl config, so naturally it does not retry. It always helps to actually Read The correct FM, slapd.conf(5) in your case.

I'd also note that slapd will issue

syncrepl rid=003 searchbase=ou=tree,ou=name: no retry defined, using default

if no retry is configured; one should at least wonder what that message means. I'd favor refusing to start if no retry is configured, since replication is not reliable without.

p.
Re: syncrepl slaves all quit after master restart - not a single retry
masar...@aero.polimi.it wrote:
>> You have no retry parameter in your syncrepl config, so naturally it does not retry. It always helps to actually Read The correct FM, slapd.conf(5) in your case.
>
> I'd also note that slapd will issue
>
> syncrepl rid=003 searchbase=ou=tree,ou=name: no retry defined, using default
>
> if no retry is configured; one should at least wonder what that message means. I'd favor refusing to start if no retry is configured, since replication is not reliable without.

That message was added in 2.4; these guys are using 2.3. At this point I've grown tired of telling people "you're using an obsolete release, you should upgrade."

--
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
Re: syncrepl slaves all quit after master restart - not a single retry
Dear Masarati,

On 13/08/10 02:44 +0200, masar...@aero.polimi.it wrote:
>> You have no retry parameter in your syncrepl config, so naturally it does not retry. It always helps to actually Read The correct FM, slapd.conf(5) in your case.

Bless you, thank you very much for that help.

> I'd also note that slapd will issue
>
> syncrepl rid=003 searchbase=ou=tree,ou=name: no retry defined, using default
>
> if no retry is configured; one should at least wonder what that message means. I'd favor refusing to start if no retry is configured, since replication is not reliable without.

Yes, that makes sense.

[r...@ldapro04.syd ~]# grep -P '\bretry' /var/log/ldap*
[r...@ldapro04.syd ~]#

No such error message seems to be present.

--
Nick Urbanik            http://nicku.org        808-71011
nick.urba...@optusnet.com.au
GPG: 7FFA CDC7 5A77 0558 DC7A 790A 16DF EC5B BB9D 2C24  ID: BB9D2C24
I disclaim, therefore I am.
Re: syncrepl slaves all quit after master restart - not a single retry
Dear Howard,

On 12/08/10 17:34 -0700, Howard Chu wrote:
> You have no retry parameter in your syncrepl config, so naturally it does not retry. It always helps to actually Read The correct FM, slapd.conf(5) in your case.

Thank you very much indeed for your very helpful, prompt and accurate reply! I will happily buy you a beer or beverage of your choice if I see you at Linuxconf or elsewhere.

--
Nick Urbanik            http://nicku.org        808-71011
nick.urba...@optusnet.com.au
GPG: 7FFA CDC7 5A77 0558 DC7A 790A 16DF EC5B BB9D 2C24  ID: BB9D2C24
I disclaim, therefore I am.