Re: [389-users] Fwd: 389 v1.2.9.8 freeze/deadlock

2011-09-01 Thread Rich Megginson
On 09/01/2011 09:02 AM, Andrey Ivanov wrote:
> The full gdb trace with debug symbols is in the attached file. Hope it
> helps.
Thanks.  Yes, this is the same thing I am able to reproduce.

Working on a fix now.
>
>
>>> The same test on 1.2.8.3 works fine; the important detail is that
>>> it is also a paged search. Here is the log for the same search on
>>> 1.2.8.3 (I'm in the process of rolling back to that version):
>>>
>>> [01/Sep/2011:16:19:39 +0200] conn=5 op=2 fd=128 closed - U1
>>> [01/Sep/2011:16:19:41 +0200] conn=6 fd=128 slot=128 connection from
>>> 129.104.31.63 to 129.104.69.49
>>> [01/Sep/2011:16:19:41 +0200] conn=6 op=0 BIND dn="" method=128 version=3
>>> [01/Sep/2011:16:19:41 +0200] conn=6 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0.017000 dn=""
>>> [01/Sep/2011:16:19:41 +0200] conn=6 op=1 SRCH
>>> base="ou=etudiants,ou=utilisateurs,dc=id,dc=polytechnique,dc=edu"
>>> scope=2 filter="(&(mail=*)(|(mail=le tallec*)(cn=le tallec*)(sn=le
>>> tallec*)(givenName=le tallec*)(displayName=le tallec*)))" attrs="cn cn
>>> mail roleOccupant display-name displayName sn sn co o o givenName
>>> legacyexchangedn objectClass uid mailnickname title company
>>> physicalDeliveryOfficeName telephoneNumber"
>>> [01/Sep/2011:16:19:41 +0200] conn=6 op=1 SORT cn (1)
>>> [01/Sep/2011:16:19:41 +0200] conn=6 op=1 RESULT err=0 tag=101
>>> nentries=0 etime=0.021000 notes=P
>>> [01/Sep/2011:16:19:41 +0200] conn=6 op=2 UNBIND
>>> [01/Sep/2011:16:19:41 +0200] conn=6 op=2 fd=128 closed - U1
>>>
>>>
>>>
>>> How do I compile the server with debug symbols? Would this be
>>> sufficient:
>>> export CFLAGS="-g"
>>> export CXXFLAGS="-g"
>> Yes.  I usually do CFLAGS="-g" CXXFLAGS="-g" configure --enable-debug
>> [other configure args]
>>
>> Alternatively, you can install the debuginfo package:
>> debuginfo-install 389-ds-base

--
389 users mailing list
389-us...@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/389-users


Re: [389-users] Fwd: 389 v1.2.9.8 freeze/deadlock

2011-09-01 Thread Rich Megginson
On 09/01/2011 08:28 AM, Andrey Ivanov wrote:
> Hi Rich,
>
> The same test on 1.2.8.3 works fine; the important detail is that
> it is also a paged search. Here is the log for the same search on
> 1.2.8.3 (I'm in the process of rolling back to that version):
>
> [01/Sep/2011:16:19:39 +0200] conn=5 op=2 fd=128 closed - U1
> [01/Sep/2011:16:19:41 +0200] conn=6 fd=128 slot=128 connection from
> 129.104.31.63 to 129.104.69.49
> [01/Sep/2011:16:19:41 +0200] conn=6 op=0 BIND dn="" method=128 version=3
> [01/Sep/2011:16:19:41 +0200] conn=6 op=0 RESULT err=0 tag=97
> nentries=0 etime=0.017000 dn=""
> [01/Sep/2011:16:19:41 +0200] conn=6 op=1 SRCH
> base="ou=etudiants,ou=utilisateurs,dc=id,dc=polytechnique,dc=edu"
> scope=2 filter="(&(mail=*)(|(mail=le tallec*)(cn=le tallec*)(sn=le
> tallec*)(givenName=le tallec*)(displayName=le tallec*)))" attrs="cn cn
> mail roleOccupant display-name displayName sn sn co o o givenName
> legacyexchangedn objectClass uid mailnickname title company
> physicalDeliveryOfficeName telephoneNumber"
> [01/Sep/2011:16:19:41 +0200] conn=6 op=1 SORT cn (1)
> [01/Sep/2011:16:19:41 +0200] conn=6 op=1 RESULT err=0 tag=101
> nentries=0 etime=0.021000 notes=P
> [01/Sep/2011:16:19:41 +0200] conn=6 op=2 UNBIND
> [01/Sep/2011:16:19:41 +0200] conn=6 op=2 fd=128 closed - U1
I am able to reproduce - https://bugzilla.redhat.com/show_bug.cgi?id=735121
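
One quick way to spot a hang like this in an access log is to look for operations that start but never log a RESULT. In the 1.2.8.3 log above, conn=6 op=1 SRCH is followed by a RESULT line; in the 1.2.9.8 log from the original report, conn=938 op=1 stops after SORT. Below is a minimal Python sketch of that check. The names unfinished_ops, RECORD and STARTS are made up for illustration and are not part of 389, and STARTS only lists the operation keywords that appear in this thread, so it would need extending for other operation types.

```python
import re
from collections import OrderedDict

# Start of an access-log record as it appears in this thread, e.g.
# "[01/Sep/2011:13:42:34 +0200] conn=938 op=1 SRCH"
RECORD = re.compile(r"\[[^\]]+\]\s+conn=(\d+)\s+op=(-?\d+)\s+(\w+)")

# Assumption: only the operation keywords seen in this thread are listed.
STARTS = {"BIND", "SRCH"}


def unfinished_ops(lines):
    """Return (conn, op, keyword) for operations that never logged a RESULT."""
    started = OrderedDict()
    for raw in lines:
        # Drop mail quoting such as ">>> " in front of the log line.
        line = raw.lstrip("> ").rstrip()
        m = RECORD.match(line)
        if not m:
            continue  # wrapped continuation lines, connection lines, blanks
        conn, op, keyword = int(m.group(1)), int(m.group(2)), m.group(3)
        if keyword == "RESULT":
            started.pop((conn, op), None)
        elif keyword in STARTS:
            started.setdefault((conn, op), keyword)
    return [(conn, op, kw) for (conn, op), kw in started.items()]


# The 1.2.9.8 excerpt from the original report, shortened.
HUNG_LOG = [
    '>>> [01/Sep/2011:13:42:34 +0200] conn=938 op=0 BIND dn="" method=128 version=3',
    ">>> [01/Sep/2011:13:42:34 +0200] conn=938 op=0 RESULT err=0 tag=97",
    '>>> nentries=0 etime=0.00 dn=""',
    ">>> [01/Sep/2011:13:42:34 +0200] conn=938 op=1 SRCH",
    ">>> [01/Sep/2011:13:42:34 +0200] conn=938 op=1 SORT cn (1)",
]

if __name__ == "__main__":
    for conn, op, keyword in unfinished_ops(HUNG_LOG):
        print(f"conn={conn} op={op} {keyword} has no RESULT")
```

An operation that is still running at the moment the log is read will also show up here, so this only narrows down where to look; the gdb trace is what shows the threads actually waiting on a lock.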
>
>
> How do I compile the server with debug symbols? Would this be
> sufficient:
> export CFLAGS="-g"
> export CXXFLAGS="-g"
>
> ?
>
>
> @+
>
> 2011/9/1 Rich Megginson:
>> On 09/01/2011 08:08 AM, Andrey Ivanov wrote:
>>> Hi,
>>>
>>> I've tried to install the 1.2.9.8 testing version in our production
>>> environment, but there is a recurring freeze/deadlock after a
>>> particular search.
>>>
>>> It is a search sent by Outlook 2003 (you type the name of a person
>>> and then click the "Check the name" button, which generates an LDAP
>>> request). The person does not exist in the given subtree. Here is the
>>> corresponding connection in the logs:
>>>
>>> [01/Sep/2011:13:42:34 +0200] conn=938 fd=129 slot=129 connection from
>>> x.x.x.x to y.y.y.y
>>> [01/Sep/2011:13:42:34 +0200] conn=938 op=0 BIND dn="" method=128 version=3
>>> [01/Sep/2011:13:42:34 +0200] conn=938 op=0 RESULT err=0 tag=97
>>> nentries=0 etime=0.00 dn=""
>>> [01/Sep/2011:13:42:34 +0200] conn=938 op=1 SRCH
>>> base="ou=etudiants,ou=utilisateurs,dc=id,dc=polytechnique,dc=edu"
>>> scope=2 filter="(&(mail=*)(|(mail=le tallec*)(cn=le tallec*)(sn=le
>>> tallec*)(givenName=le tallec*)(displayName=le tallec*)))" attrs="cn cn
>>> mail roleOccupant display-name displayName sn sn co o o givenName
>>> legacyexchangedn objectClass uid mailnickname title company
>>> physicalDeliveryOfficeName telephoneNumber"
>>> [01/Sep/2011:13:42:34 +0200] conn=938 op=1 SORT cn (1)
>>> 
>>>
>>>
>>> The problem is reproducible every time. Here is the interesting part of
>>> the gdb trace:
>>>
>>> Thread 42 (Thread 0x42201940 (LWP 25005)):
>>> #0  0x0038644cd722 in select () from /lib64/libc.so.6
>>> No symbol table info available.
>>> #1  0x2b8ffb1bf959 in DS_Sleep () from
>>> /Local/dirsrv/lib/dirsrv/libslapd.so.0
>>> No symbol table info available.
>>> #2  0x2b900104e51e in deadlock_threadmain () from
>>> /Local/dirsrv/lib/dirsrv/plugins/libback-ldbm.so
>>> No symbol table info available.
>>> #3  0x0038670284ad in ?? () from /usr/lib64/libnspr4.so
>>> No symbol table info available.
>>> #4  0x00386500673d in start_thread () from /lib64/libpthread.so.0
>>> No symbol table info available.
>>> #5  0x0038644d44bd in clone () from /lib64/libc.so.6
>>> No symbol table info available.
>>> ...
>> This is the database housekeeping thread that checks for database deadlocks.
>>   This is normal.
>>> Thread 24 (Thread 0x4d613940 (LWP 25023)):
>>> #0  0x00386500d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
>>> No symbol table info available.
>>> #1  0x003865008e50 in _L_lock_1233 () from /lib64/libpthread.so.0
>>> No symbol table info available.
>>> #2  0x003865008dd3 in pthread_mutex_lock () from
>>> /lib64/libpthread.so.0
>>> No symbol table info available.
>>> #3  0x003867022ec9 in PR_Lock () from /usr/lib64/libnspr4.so
>>> No symbol table info available.
>>> #4  0x2b8ffb18b308 in slapi_pblock_get () from
>>> /Local/dirsrv/lib/dirsrv/libslapd.so.0
>>> No symbol table info available.
>>> #5  0x2b88ac54 in DS_LASIpGetter () from
>>> /Local/dirsrv/lib/dirsrv/plugins/libacl-plugin.so
>>> No symbol table info available.
>>> #6  0x2b90001bfb08 in ACL_GetAttribute () from
>>> /Local/dirsrv/lib/dirsrv/libns-dshttpd.so.0
>>> No symbol table info available.
>>> #7  0x2b90001be979 in LASIpEval () from
>>> /Local/dirsrv/lib/dirsrv/libns-dshttpd.so.0
>>> No symbol table info available.
>>> #8  0x2b90001c0c30 in ACLEvalAce(NSErr_s*, ACLEvalHandle*,
>>> ACLExprHandle*, unsigned long*, PListStruct_s**, PListStruct_s*) ()
>>> from /Local/dirsrv/lib/dirsrv/libns-dshttp

Re: [389-users] Fwd: 389 v1.2.9.8 freeze/deadlock

2011-09-01 Thread Rich Megginson
On 09/01/2011 08:28 AM, Andrey Ivanov wrote:
> Hi Rich,
>
> The same test on 1.2.8.3 works fine; the important detail is that
> it is also a paged search. Here is the log for the same search on
> 1.2.8.3 (I'm in the process of rolling back to that version):
>
> [01/Sep/2011:16:19:39 +0200] conn=5 op=2 fd=128 closed - U1
> [01/Sep/2011:16:19:41 +0200] conn=6 fd=128 slot=128 connection from
> 129.104.31.63 to 129.104.69.49
> [01/Sep/2011:16:19:41 +0200] conn=6 op=0 BIND dn="" method=128 version=3
> [01/Sep/2011:16:19:41 +0200] conn=6 op=0 RESULT err=0 tag=97
> nentries=0 etime=0.017000 dn=""
> [01/Sep/2011:16:19:41 +0200] conn=6 op=1 SRCH
> base="ou=etudiants,ou=utilisateurs,dc=id,dc=polytechnique,dc=edu"
> scope=2 filter="(&(mail=*)(|(mail=le tallec*)(cn=le tallec*)(sn=le
> tallec*)(givenName=le tallec*)(displayName=le tallec*)))" attrs="cn cn
> mail roleOccupant display-name displayName sn sn co o o givenName
> legacyexchangedn objectClass uid mailnickname title company
> physicalDeliveryOfficeName telephoneNumber"
> [01/Sep/2011:16:19:41 +0200] conn=6 op=1 SORT cn (1)
> [01/Sep/2011:16:19:41 +0200] conn=6 op=1 RESULT err=0 tag=101
> nentries=0 etime=0.021000 notes=P
> [01/Sep/2011:16:19:41 +0200] conn=6 op=2 UNBIND
> [01/Sep/2011:16:19:41 +0200] conn=6 op=2 fd=128 closed - U1
>
>
>
> How do I compile the server with debug symbols? Would this be
> sufficient:
> export CFLAGS="-g"
> export CXXFLAGS="-g"
Yes.  I usually do CFLAGS="-g" CXXFLAGS="-g" configure --enable-debug
[other configure args]

Alternatively, you can install the debuginfo package:
debuginfo-install 389-ds-base

That will give you the full debugging symbols in gdb.
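
Once a full multi-thread trace is available, it can help to list only the threads whose innermost frames are lock waits, together with the first caller outside the locking code. Below is a minimal Python sketch that does this for text in the same layout as the gdb trace quoted in this thread. The names blocked_threads and is_lock_frame are made up for illustration, and the set of lock-related frame names is an assumption taken from the frames visible in that trace (__lll_lock_wait, _L_lock_*, pthread_mutex_lock, PR_Lock).

```python
import re

THREAD = re.compile(r"^Thread (\d+) \(")
# "#4  0x2b8ffb18b308 in slapi_pblock_get () from ..." -> "slapi_pblock_get"
FRAME = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?([^\s(]+)")

# Assumption: frame names treated as "inside the locking code".
LOCK_FRAMES = {"__lll_lock_wait", "pthread_mutex_lock", "PR_Lock"}


def is_lock_frame(name):
    return name in LOCK_FRAMES or name.startswith("_L_lock_")


def blocked_threads(trace_lines):
    """Map thread number -> first caller outside the lock code, for
    threads whose innermost frame (#0) is a lock wait."""
    stacks = {}
    current = None
    for raw in trace_lines:
        line = raw.lstrip("> ").rstrip()  # drop mail quoting
        t = THREAD.match(line)
        if t:
            current = int(t.group(1))
            stacks[current] = []
            continue
        f = FRAME.match(line)
        if f and current is not None:
            stacks[current].append(f.group(2))
    blocked = {}
    for thread, frames in stacks.items():
        if frames and is_lock_frame(frames[0]):
            blocked[thread] = next(
                (name for name in frames if not is_lock_frame(name)), None
            )
    return blocked


# Shortened excerpt of the trace from the original report.
SAMPLE_TRACE = [
    ">>> Thread 42 (Thread 0x42201940 (LWP 25005)):",
    ">>> #0  0x0038644cd722 in select () from /lib64/libc.so.6",
    ">>> #1  0x2b8ffb1bf959 in DS_Sleep () from",
    ">>> Thread 24 (Thread 0x4d613940 (LWP 25023)):",
    ">>> #0  0x00386500d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0",
    ">>> #1  0x003865008e50 in _L_lock_1233 () from /lib64/libpthread.so.0",
    ">>> #2  0x003865008dd3 in pthread_mutex_lock () from",
    ">>> #3  0x003867022ec9 in PR_Lock () from /usr/lib64/libnspr4.so",
    ">>> #4  0x2b8ffb18b308 in slapi_pblock_get () from",
]

if __name__ == "__main__":
    for thread, caller in blocked_threads(SAMPLE_TRACE).items():
        print(f"Thread {thread} is waiting on a lock, called from {caller}")
```

This only lists the waiters. It does not say which thread holds the lock, so the full stacks of the remaining threads still have to be read by hand.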
