Bug#827135: [Pkg-openldap-devel] Bug#827135: slapd won't stop (shutdown) on multi-core system under stress

Ryan Tandy Fri, 28 Jul 2017 20:04:26 -0700

Control: tag -1 moreinfo unreproducible

Hi Zvika,


My apologies for taking so long to get back to you on this.

On Sun, Jun 12, 2016 at 07:37:36PM +0000, Zvika Ferentz wrote:

How to reproduce it:
-------------------------------
I guess that there are a few ways to reproduce it. I managed to easily reproduce it with
two terminals - one producing "ldapsearch" stress and the other restarting
slapd:
- Open two terminals.
- On terminal #1 I'm just manually running "slapd restart" commands:
  # /etc/init.d/slapd status ; /etc/init.d/slapd restart
- On terminal #2 I'm running infinite loops of simple "ldapsearch" (100
concurrent processes running loops of ldapsearch). Terminal #2 is trying to
simulate many concurrent read operations. See "More Information" below for the
exact scripts that I used.

Incorrect behavior:
-------------------------
The "slapd restart" works a few times, and then the "stop" operation fails.
The stop continues to fail even if I stop all "stress" and terminate all
ldapsearch processes/connections (CPU is 99% idle!)

Expected Behavior:
-------------------------
All slapd stop/restart operations complete successfully


More Information (optional - my exact scripts):
----------------------------------------------------------------
On terminal #2 I used a very simple script to generate a "read only" stress:
# cat > ldaploop.sh << EOF
#!/bin/sh
while true ; do  ldapsearch -x -Z ; done
EOF

# cat > manyloops.sh << "EOF"
#!/bin/sh
for i in `seq 1 100` ; do ( ./ldaploop.sh &) ; done
EOF

As previously mentioned, I ran "manyloops.sh" to generate 100 running
processes where each one simply runs "ldapsearch" (locally).

Thanks a lot for the detailed steps to reproduce. I got access to a VM
with 16 CPUs where I could try this. It doesn't have a wheezy chroot any
longer, but I tried the jessie version (2.4.40+dfsg-1+deb8u3).

I'm afraid I have not been able to trigger any hangs, even using your
exact scripts and after restarting slapd many times.
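
For anyone who wants to repeat the restart side of the test unattended, here is a minimal sketch. The function name restart_until_failure, the iteration count, and the 120 second limit are all made up for illustration and are not part of the original report; it also assumes timeout(1) from GNU coreutils is available.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): run a command
# repeatedly and stop at the first failure or timeout.
# Usage: restart_until_failure COUNT TIMEOUT_SECONDS COMMAND [ARGS...]
restart_until_failure() {
    count=$1
    limit=$2
    shift 2
    i=1
    while [ "$i" -le "$count" ]; do
        # timeout(1) kills the command and returns non-zero if it
        # runs longer than $limit seconds.
        if ! timeout "$limit" "$@"; then
            echo "iteration $i: restart failed or timed out"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $count restarts completed"
}

# Example invocation (assumed values), run as root while the
# ldapsearch loops are active in another terminal:
# restart_until_failure 50 120 /etc/init.d/slapd restart
```

If the loop stops early, slapd is presumably in the stuck state and that is the moment to attach a debugger.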


I'm testing with the following, very simple, config:

include /etc/ldap/schema/core.schema
include /etc/ldap/schema/cosine.schema
include /etc/ldap/schema/nis.schema
include /etc/ldap/schema/inetorgperson.schema

tlscertificatefile ssl-cert-snakeoil.pem
tlscertificatekeyfile ssl-cert-snakeoil.key

moduleload back_mdb

database mdb
suffix dc=example,dc=com
directory db
index objectClass eq

and a database of 1000 entries. I tried both the hdb and mdb backends.

Do you still encounter this bug on jessie or stretch? Is there more to
your configuration than the simple config I posted, that might be
relevant?

If you can still reproduce the bug, it would be great if you could
install slapd-dbg and libldap-2.4-2-dbg, cause slapd to hang, and then
capture a backtrace with gdb while it's stuck:


gdb -p $(pidof slapd)
thread apply all bt
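
If it is easier to save the output to a file for attaching to the bug, the same thing can be done non-interactively. This is only a sketch: the function name capture_backtrace and the output file name are hypothetical, and it assumes a single slapd process is running so that pidof returns one PID.

```shell
#!/bin/sh
# Hypothetical helper: write backtraces of all threads of a running
# process to a file, without an interactive gdb session.
# Usage: capture_backtrace PID OUTFILE
capture_backtrace() {
    pid=$1
    out=$2
    # -batch makes gdb exit after running the -ex commands;
    # the process is detached again when gdb exits.
    gdb -p "$pid" -batch -ex "thread apply all bt" > "$out" 2>&1
}

# Example invocation (assumed file name), run as root:
# capture_backtrace "$(pidof slapd)" slapd-backtrace.txt
```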

thanks,
Ryan
