I'm still trying to figure out why my servers sometimes get into a state
where queries requesting the memberOf attribute take an exceedingly long
time to process, for example:

Feb 13 19:46:05 ldap-02 slapd[13564]: conn=297643 fd=104 ACCEPT from 
PATH=/var/symas/run/ldapi (PATH=/var/symas/run/ldapi)
Feb 13 19:46:05 ldap-02 slapd[13564]: conn=297643 op=0 BIND 
dn="cn=ldaproot,dc=cpp,dc=edu" method=128
Feb 13 19:46:05 ldap-02 slapd[13564]: conn=297643 op=0 BIND 
dn="cn=ldaproot,dc=cpp,dc=edu" mech=SIMPLE bind_ssf=0 ssf=71
Feb 13 19:46:05 ldap-02 slapd[13564]: conn=297643 op=0 RESULT tag=97 err=0 
qtime=0.000006 etime=0.000050 text=
Feb 13 19:46:05 ldap-02 slapd[13564]: conn=297643 op=1 SRCH 
base="dc=cpp,dc=edu" scope=2 deref=0 filter="(uid=henson)"
Feb 13 19:46:05 ldap-02 slapd[13564]: conn=297643 op=1 SRCH attr=memberOf
Feb 13 19:46:42 ldap-02 slapd[13564]: conn=297643 op=1 SEARCH RESULT tag=101 
err=0 qtime=0.000012 etime=36.955710 nentries=1 text=

How is the qtime calculated? It is nice and short, despite the wall clock
reading over 30 seconds :(.

Usually I have to reboot the server completely to clear this up, but this
time I just had to restart it, and then the queries were lickity split
again:

Feb 13 19:55:01 ldap-02 slapd[89655]: conn=1556 fd=40 ACCEPT from 
PATH=/var/symas/run/ldapi (PATH=/var/symas/run/ldapi)
Feb 13 19:55:01 ldap-02 slapd[89655]: conn=1556 op=0 BIND 
dn="cn=ldaproot,dc=cpp,dc=edu" method=128
Feb 13 19:55:01 ldap-02 slapd[89655]: conn=1556 op=0 BIND 
dn="cn=ldaproot,dc=cpp,dc=edu" mech=SIMPLE bind_ssf=0 ssf=71
Feb 13 19:55:01 ldap-02 slapd[89655]: conn=1556 op=0 RESULT tag=97 err=0 
qtime=0.000009 etime=0.000088 text=
Feb 13 19:55:01 ldap-02 slapd[89655]: conn=1556 op=1 SRCH base="dc=cpp,dc=edu" 
scope=2 deref=0 filter="(uid=henson)"
Feb 13 19:55:01 ldap-02 slapd[89655]: conn=1556 op=1 SRCH attr=memberOf
Feb 13 19:55:01 ldap-02 slapd[89655]: conn=1556 op=1 SEARCH RESULT tag=101 
err=0 qtime=0.000013 etime=0.213149 nentries=1 text=

Over 30 seconds elapsed time to .2 seconds? I'd like to see the .2 all
the time :).

When the server gets like this, there's a very high read IO load, 200-300Mb/s,
compared to generally less than 1Mb/s when things are working right.

It almost seems like it's doing a full disk scan searching for members
every time.

Any suggestions on what to dig into?

Thanks...

Reply via email to