Control: tag -1 moreinfo unreproducible

Hi Zvika,

My apologies for taking so long to get back to you on this.

On Sun, Jun 12, 2016 at 07:37:36PM +0000, Zvika Ferentz wrote:
> How to reproduce it:
> -------------------------------
> I guess that there are a few ways to reproduce it. I managed to easily
> reproduce it with two terminals - one producing "ldapsearch" stress and the
> other restarting slapd:
> - Open two terminals.
> - On terminal #1 I'm just manually running "slapd restart" commands:
>   # /etc/init.d/slapd status ; /etc/init.d/slapd restart
> - On terminal #2 I'm running infinite loops of simple "ldapsearch" (100
> concurrent processes running loops of ldapsearch). Terminal #2 is trying to
> simulate many concurrent read operations. See "More Information" later for the
> exact scripts that I used.
>
> Incorrect behavior:
> -------------------------
> The "slapd restart" works a few times, and then the "stop" operation fails.
> The stop continues to fail even if I stop all "stress" and terminate all
> ldapsearch processes/connections (CPU is 99% idle!)
>
> Expected Behavior:
> -------------------------
> All slapd stop/restart operations complete successfully.
>
>
> More Information (optional - my exact scripts):
> ----------------------------------------------------------------
> On terminal #2 I used a very simple script to generate a "read only" stress:
> # cat > ldaploop.sh << EOF
> #!/bin/sh
> while true ; do  ldapsearch -x -Z ; done
> EOF
>
> # cat > manyloops.sh << "EOF"
> #!/bin/sh
> for i in `seq 1 100` ; do ( ./ldaploop.sh &) ; done
> EOF
>
> As previously mentioned, I ran "manyloops.sh" to generate 100 running
> processes where each one simply runs "ldapsearch" (locally).

Thanks a lot for the detailed steps to reproduce. I got access to a VM with 16 CPUs where I could try this. It doesn't have a wheezy chroot any longer, but I tried the jessie version (2.4.40+dfsg-1+deb8u3).

I'm afraid I have not been able to trigger any hangs, even using your exact scripts and after restarting slapd many times.

I'm testing with the following, very simple, config:

include /etc/ldap/schema/core.schema
include /etc/ldap/schema/cosine.schema
include /etc/ldap/schema/nis.schema
include /etc/ldap/schema/inetorgperson.schema

tlscertificatefile ssl-cert-snakeoil.pem
tlscertificatekeyfile ssl-cert-snakeoil.key

moduleload back_mdb

database mdb
suffix dc=example,dc=com
directory db
index objectClass eq

and a database of 1000 entries. I tried both the hdb and mdb backends.
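
For anyone who wants to build a comparable test database, something along these lines should work. It is only a rough sketch: the function name gen_ldif, the uid=userN naming and the attribute values are made up for illustration and are not the entries I actually used.

```shell
#!/bin/sh
# Hypothetical helper: print a base entry plus N inetOrgPerson entries
# under dc=example,dc=com as LDIF on stdout.
gen_ldif() {
    count=${1:-1000}
    # Base entry matching the "suffix" line of the config above
    printf 'dn: dc=example,dc=com\n'
    printf 'objectClass: dcObject\nobjectClass: organization\n'
    printf 'dc: example\no: Example\n\n'
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'dn: uid=user%s,dc=example,dc=com\n' "$i"
        printf 'objectClass: inetOrgPerson\n'
        # cn and sn are required by inetOrgPerson; values are placeholders
        printf 'uid: user%s\ncn: User %s\nsn: Test\n\n' "$i" "$i"
        i=$((i + 1))
    done
}
```

Running "gen_ldif 1000 > test.ldif" and then loading the file with slapadd (for example "slapadd -f slapd.conf -l test.ldif" while slapd is stopped, assuming the config is saved as slapd.conf) should give a database of the same size.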

Do you still encounter this bug on jessie or stretch? Is there more to your configuration than the simple config I posted, that might be relevant?

If you can still reproduce the bug, it would be great if you could install slapd-dbg and libldap-2.4-2-dbg, cause slapd to hang, and then capture a backtrace with gdb while it's stuck:

gdb -p $(pidof slapd)
thread apply all bt
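
If an interactive gdb session is inconvenient, the same backtrace can be captured non-interactively. The following is a minimal sketch; the function name capture_backtrace and the default output file name are my own choices for illustration, and it needs to run as root (or as the user slapd runs as).

```shell
#!/bin/sh
# Hypothetical helper: write backtraces of all threads of a running
# process (default: slapd) to a file, without an interactive gdb session.
capture_backtrace() {
    name=${1:-slapd}
    out=${2:-slapd-backtrace.txt}
    pid=$(pidof "$name" 2>/dev/null || true)
    if [ -z "$pid" ]; then
        echo "no running $name process found" >&2
        return 1
    fi
    # pidof may print several PIDs; attach to the first one only
    set -- $pid
    # -batch exits after running the commands given with -ex
    gdb -p "$1" -batch -ex 'thread apply all bt' > "$out" 2>&1
}
```

Calling "capture_backtrace" while slapd is stuck should leave the output in slapd-backtrace.txt, which can then be attached to the bug report.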

thanks,
Ryan
