On 08/01/2012 02:49 PM, Loris Santamaria wrote:
On Wed, 2012-08-01 at 13:50 -0600, Rich Megginson wrote:
On 08/01/2012 01:34 PM, Loris Santamaria wrote:
On Wed, 2012-08-01 at 09:58 -0600, Rich Megginson wrote:
On 08/01/2012 09:20 AM, Loris Santamaria wrote:
On Wed, 2012-08-01 at 10:35 -0400, Simo Sorce wrote:
On Tue, 2012-07-31 at 08:49 -0700, Stephen Ingram wrote:
On Tue, Jul 31, 2012 at 1:39 AM, Petr Spacek <pspa...@redhat.com> wrote:
On 07/31/2012 12:27 AM, John Dennis wrote:
What is taking so long with session bookkeeping? I don't know yet. I would
need more timing instrumentation. I will say that when I looked at the
python-krbV code (which we use to populate the ccache from the session and
read it back to store in the session), it seemed to be remarkably
inefficient. We also elected to use a file-based ccache rather than an
in-memory ccache (which means there is a bit of file I/O occurring).
A note regarding python-krbV:
I used python-krbV extensively in my thesis for a KDC stress test. Python-krbV
can obtain several thousand TGTs per second (even with the ccache in a
file). AFAIK the VFS calls are not done synchronously. But other parts of
python-krbV were left uncovered, so it can contain some surprises.
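For reference, here is a minimal sketch of the kind of TGT-acquisition loop
described above, using python-krbV with a file-based ccache. The principal,
keytab path, and iteration count are placeholders, not values from this
thread:

# Sketch: repeatedly fetch a TGT from the KDC into a file-based ccache
# and report the rate, roughly what a python-krbV stress test would do.
# The principal and keytab below are placeholders.
import time
import krbV

ctx = krbV.default_context()
princ = krbV.Principal(name='host/ipa.example.com@EXAMPLE.COM', context=ctx)
keytab = krbV.Keytab(name='FILE:/etc/krb5.keytab', context=ctx)

iterations = 100
start = time.time()
for i in range(iterations):
    ccache = krbV.CCache(name='FILE:/tmp/krb5cc_stress', context=ctx,
                         primary_principal=princ)
    ccache.init(princ)                                        # reset the ccache
    ccache.init_creds_keytab(keytab=keytab, principal=princ)  # ask the KDC for a TGT
elapsed = time.time() - start
print("%d TGTs in %.2fs (%.1f TGTs/s)" % (iterations, elapsed, iterations / elapsed))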

=== Wild speculation follows ===
1.5 seconds is an incredibly long time; it sounds like some kind of timeout.
The krb5 libs have a usual timeout of 1 second per request.

Are all KDCs in /etc/krb5.conf alive and reachable?
In this case, as I'm referring to the extreme slowness of the Web UI,
the KDC is on the same system (the ipa server) that is making the
request, correct?

Is SSSD running on the problematic server?
Yes. Again, I'm guessing the problematic server is the IPA server itself.

Is the proper KDC selected by the SSSD KDC auto-locator plugin?
(See /var/lib/sss/pubconf/)
Yes, I checked that file and it contains the IP address of the IPA server,
i.e. this same machine. Should this perhaps be 127.0.0.1 instead?

I have also checked the resolv.conf file, and indeed the IP points to
the IPA server itself (same machine) as expected. Both forward and
reverse DNS work. I'm not really sure what else could be set up
incorrectly to cause any KDC slowness.
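A quick way to cross-check the locator data against DNS is something like the
following sketch (the hostname is a placeholder, and it assumes SSSD's usual
kdcinfo.<REALM> file naming under /var/lib/sss/pubconf):

# Sketch: compare the KDC address that the SSSD locator plugin hands to
# libkrb5 with the address the IPA server's hostname resolves to.
# IPA_SERVER is a placeholder hostname.
import glob
import socket

IPA_SERVER = "ipa.example.com"

resolved = socket.gethostbyname(IPA_SERVER)
print("DNS says %s -> %s" % (IPA_SERVER, resolved))

for path in glob.glob("/var/lib/sss/pubconf/kdcinfo.*"):
    with open(path) as f:
        kdc_addrs = f.read().split()
    print("%s -> %s" % (path, kdc_addrs))
    if not any(resolved in addr for addr in kdc_addrs):
        print("WARNING: %s does not list the address DNS returns" % path)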

Due to the extreme UI slowness issue, I have not created any replicas,
so this system is it. I'm not sure I would be able to see the 1.5 s
delay if it weren't compounded by the overall slowness of the Web UI;
however, the KDC seems to perform well for other systems in the realm.
I'm certainly not taxing it with a huge load, but tickets seem to be
issued without delay.
Stephen,
another user sent me a wireshark trace for a similar performance issue.
So far I see a pause when doing the first leg of a SASL authentication.
This may well explain your issue as well.
Hi, I experience the same delay in SASL authentication. The numbers I
posted on freeipa-users show a 1-2 second delay with SASL
authentication:

# time ldapsearch -x uid=bdteg01662 dn
# extended LDIF
#
# LDAPv3
# base <dc=xxx,dc=gob,dc=ve> (default) with scope subtree
# filter: uid=bdteg01662
# requesting: dn
#

# bdteg01662, users, accounts, xxx.gob.ve
dn: uid=bdteg01662,cn=users,cn=accounts,dc=xxx,dc=gob,dc=ve

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

real    0m0.006s
user    0m0.001s
sys     0m0.003s

# time ldapsearch -Y GSSAPI uid=bdteg01662 dn
SASL/GSSAPI authentication started
SASL username: ad...@xxx.gob.ve
SASL SSF: 56
SASL data security layer installed.
# extended LDIF
#
# LDAPv3
# base <dc=xxx,dc=gob,dc=ve> (default) with scope subtree
# filter: uid=bdteg01662
# requesting: dn
#

# bdteg01662, users, accounts, xxx.gob.ve
dn: uid=bdteg01662,cn=users,cn=accounts,dc=xxx,dc=gob,dc=ve

# search result
search: 4
result: 0 Success

# numResponses: 2
# numEntries: 1

real    0m2.344s
user    0m0.007s
sys     0m0.005s
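For what it's worth, the comparison above can also be scripted; here is a rough
python-ldap sketch along the lines of Simo's loop suggestion further down the
thread. The server URI, base DN, and filter are placeholders, and a valid
Kerberos ticket is assumed to be present in the environment:

# Sketch: time an anonymous simple bind + search versus a GSSAPI bind +
# search over several iterations.  URI, BASE and FILTER are placeholders.
import time
import ldap
import ldap.sasl

URI = "ldap://ipa.example.com"
BASE = "dc=example,dc=com"
FILTER = "(uid=someuser)"

def timed_search(use_gssapi):
    conn = ldap.initialize(URI)
    start = time.time()
    if use_gssapi:
        conn.sasl_interactive_bind_s("", ldap.sasl.gssapi(""))  # needs a ticket
    else:
        conn.simple_bind_s()                                    # anonymous bind
    conn.search_s(BASE, ldap.SCOPE_SUBTREE, FILTER, ["uid"])
    elapsed = time.time() - start
    conn.unbind_s()
    return elapsed

for label, gssapi in (("simple", False), ("GSSAPI", True)):
    samples = [timed_search(gssapi) for _ in range(10)]
    print("%s: min=%.3fs max=%.3fs" % (label, min(samples), max(samples)))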
Can you post excerpts from your 389 access log showing the sequence of
operations for this connection, bind and search?
Here they are:

[01/Aug/2012:10:39:40 -041800] conn=33 fd=70 slot=70 connection from 
172.18.32.246 to 172.18.32.246
[01/Aug/2012:10:39:40 -041800] conn=33 op=0 BIND dn="" method=sasl version=3 
mech=GSSAPI
[01/Aug/2012:10:39:42 -041800] conn=33 op=0 RESULT err=14 tag=97 nentries=0 
etime=2, SASL bind in progress
Yep, this is it - this should not be taking 2 seconds. I'd like to see
what internal operations are going on during this time. Try this: follow
the directions at http://port389.org/wiki/FAQ#Troubleshooting, but for
the access log, to turn on internal operation logging:

dn: cn=config
changetype: modify
replace: nsslapd-accesslog-level
nsslapd-accesslog-level: 4

Then turn the access log level back to the default (256) after
reproducing the problem.  We should then be able to see the sequence of
internal operations triggered by the BIND dn="" method=sasl version=3
mech=GSSAPI
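If you prefer to script it rather than feed that LDIF to ldapmodify, a rough
python-ldap equivalent would be the following sketch (the URI and credentials
are placeholders):

# Sketch: raise the access log level to include internal operations,
# reproduce the slow bind, then restore the default of 256.
# The URI and bind credentials below are placeholders.
import ldap

conn = ldap.initialize("ldap://ipa.example.com")
conn.simple_bind_s("cn=Directory Manager", "password")

conn.modify_s("cn=config",
              [(ldap.MOD_REPLACE, "nsslapd-accesslog-level", b"4")])

# ... reproduce the slow GSSAPI bind here ...

conn.modify_s("cn=config",
              [(ldap.MOD_REPLACE, "nsslapd-accesslog-level", b"256")])
conn.unbind_s()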

Here they are, with the 2-second delay:

[01/Aug/2012:16:09:48 -041800] conn=Internal op=-1 SRCH 
base="cn=anonymous-limits,cn=etc,dc=XXXX,dc=XXXX,dc=ve" scope=0 
filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs="nsLookThroughLimit 
nsIDListScanLimit nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit 
nsIdleTimeout"
[01/Aug/2012:16:09:48 -041800] conn=Internal op=-1 RESULT err=0 tag=48 
nentries=1 etime=0
[01/Aug/2012:16:09:48 -041800] conn=Internal op=-1 SRCH 
base="cn=anonymous-limits,cn=etc,dc=XXXX,dc=XXXX,dc=ve" scope=0 
filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs="nsLookThroughLimit 
nsIDListScanLimit nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit 
nsIdleTimeout"
[01/Aug/2012:16:09:48 -041800] conn=Internal op=-1 RESULT err=0 tag=48 
nentries=1 etime=0
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH base="cn=config,cn=userRoot,cn=ldbm 
database,cn=plugins,cn=config" scope=1 filter="objectclass=vlvsearch" attrs=ALL
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 
nentries=0 etime=0
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH 
base="cn=anonymous-limits,cn=etc,dc=XXXX,dc=XXXX,dc=ve" scope=0 
filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs="nsLookThroughLimit 
nsIDListScanLimit nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit 
nsIdleTimeout"
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 
nentries=1 etime=0
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH base="cn=config,cn=userRoot,cn=ldbm 
database,cn=plugins,cn=config" scope=1 filter="objectclass=vlvsearch" attrs=ALL
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 
nentries=0 etime=0
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH 
base="cn=anonymous-limits,cn=etc,dc=XXXX,dc=XXXX,dc=ve" scope=0 
filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs="nsLookThroughLimit 
nsIDListScanLimit nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit 
nsIdleTimeout"
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 
nentries=1 etime=0
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH base="dc=XXXX,dc=XXXX,dc=ve" 
scope=2 filter="(krbPrincipalName=ad...@xxxx.xxxx.ve)" attrs=ALL
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 
nentries=1 etime=0
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH 
base="uid=admin,cn=users,cn=accounts,dc=XXXX,dc=XXXX,dc=ve" scope=0 
filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs="nsLookThroughLimit 
nsIDListScanLimit nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit 
nsIdleTimeout"
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 
nentries=1 etime=0
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 SRCH 
base="uid=admin,cn=users,cn=accounts,dc=XXXX,dc=XXXX,dc=ve" scope=0 
filter="(|(objectclass=*)(objectclass=ldapsubentry))" attrs=ALL
[01/Aug/2012:16:09:50 -041800] conn=Internal op=-1 RESULT err=0 tag=48 
nentries=1 etime=0


Just for kicks I tried the search that appears in the log, and it returns
immediately:

ldapsearch -x -b "cn=anonymous-limits,cn=etc,dc=bt,dc=gob,dc=ve" 
"(|(objectclass=*)(objectclass=ldapsubentry))" nsLookThroughLimit nsIDListScanLimit 
nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit nsPagedSizeLimit 
nsIdleTimeout

# extended LDIF
#
# LDAPv3
# base <cn=anonymous-limits,cn=etc,dc=bt,dc=gob,dc=ve> with scope subtree
# filter: (|(objectclass=*)(objectclass=ldapsubentry))
# requesting: nsLookThroughLimit nsIDListScanLimit
nsPagedLookThroughLimit nsPagedIDListScanLimit nsSizeLimit nsTimeLimit
nsPagedSizeLimit nsIdleTimeout
#

# anonymous-limits, etc, bt.gob.ve
dn: cn=anonymous-limits,cn=etc,dc=bt,dc=gob,dc=ve
nsLookThroughLimit: 5000
nsSizeLimit: 5000

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

Thanks - is there a ticket open for this issue?



[01/Aug/2012:10:39:42 -041800] conn=33 op=1 BIND dn="" method=sasl version=3 
mech=GSSAPI
[01/Aug/2012:10:39:42 -041800] conn=33 op=1 RESULT err=14 tag=97 nentries=0 
etime=0, SASL bind in progress
[01/Aug/2012:10:39:42 -041800] conn=33 op=2 BIND dn="" method=sasl version=3 
mech=GSSAPI
[01/Aug/2012:10:39:42 -041800] conn=33 op=2 RESULT err=0 tag=97 nentries=0 etime=0 
dn="uid=admin,cn=users,cn=accounts,dc=XXXX,dc=XXXX,dc=ve"
[01/Aug/2012:10:39:42 -041800] conn=33 op=3 SRCH base="dc=XXXX,dc=XXXX,dc=ve" scope=2 
filter="(uid=bdteg01662)" attrs=ALL
[01/Aug/2012:10:39:42 -041800] conn=33 op=3 RESULT err=0 tag=101 nentries=1 
etime=0




Can you test connecting to the LDAP server using ldapsearch -Y GSSAPI
(you need a Kerberos ticket) and tell me whether you experience any
delay?
If you could run a bunch of searches in a loop and take a wireshark
trace, that may help with analyzing the timings and seeing whether there
is a correlation.

Simo.

_______________________________________________
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel
