RE: Deadlock in openssl1.0.1e.

2014-01-15 Thread Tayade, Nilesh
 -Original Message-
 From: owner-openssl-us...@openssl.org [mailto:owner-openssl-
 us...@openssl.org] On Behalf Of Dr. Stephen Henson
 Sent: Wednesday, January 15, 2014 1:56 AM
 To: openssl-users@openssl.org
 Subject: Re: Deadlock in openssl1.0.1e.
 
 On Tue, Jan 14, 2014, Tayade, Nilesh wrote:
 
  Hi,
 
  I am not able to understand what???s going wrong, below is the latest
 stack trace I got. Thread 19 is killed by a watchdog, and it was waiting on a
 lock.
  Other threads do not show any lock being held.
 
 
 
 I'd suggest you modify your locking callback to store the location (file, line
 number) for the lock in question (18 which is CRYPTO_LOCK_RAND) when a
 lock is acquired successfully and clear it when the lock is released. Then 
 when
 you get a deadlock you can see the last instance that held (but did not
 release) the lock.
 
 You could try compiling with LOCK_DEBUG but that will create a *lot* of
 output.


Thanks for your suggestions. I did try capturing the debug information and 
below is how the state of the locks is. The array per lock stores information 
of the owner, file and lock_present i.e. lock is held or not.
I see that the lock # 18 is held but when thread 19 was waiting, it was trying 
for lock # 19. Not sure what's going on.

(gdb) p ssl_locks[18]
$22 = {
   thread_id = 140003803043584,
   file_name = md_rand.c, '\000' repeats 30 times,
   file_name_len = 9,
   line = 384,
   lock_present = 1
}
(gdb) p ssl_locks[19]
$23 = {
   thread_id = 140003803043584,
   file_name = md_rand.c, '\000' repeats 30 times,
   file_name_len = 9,
   line = 389,
   lock_present = 0
}

The mutex buffer looks like this. I am unable to interpret all the fields, any 
pointers on these?
(gdb) p mutex_buf[19]
$15 = {
   __data = {
 __lock = -1770245546,
 __count = 161970406,
 __owner = -794915378,
 __nusers = 890065223,
 __kind = 634752049,
 __spins = 239812167,
 __list = {
   __prev = 0x3cab28a258ddf6a1,
   __next = 0x1b78c957a2cf4cef
 }
   },
   __size = 
V2|\226\346x\247\t\316\215\236\320GQ\r51\214\325%GK\016\241\366\335X\242(\253\357LϢW\311x\033,
 

   __align = 695657599214563926
}

 
 Steve.
 --
 Dr Stephen N. Henson. OpenSSL project core developer.
 Commercial tech support now available see: http://www.openssl.org

--
Thanks,
Nilesh

 



Re: Deadlock in openssl1.0.1e.

2014-01-15 Thread Dr. Stephen Henson
On Wed, Jan 15, 2014, Tayade, Nilesh wrote:

 
 
 Thanks for your suggestions. I did try capturing the debug information and 
 below is how the state of the locks is. The array per lock stores information 
 of the owner, file and lock_present i.e. lock is held or not.
 I see that the lock # 18 is held but when thread 19 was waiting, it was 
 trying for lock # 19. Not sure what's going on.
 
 (gdb) p ssl_locks[18]
 $22 = {
thread_id = 140003803043584,
file_name = md_rand.c, '\000' repeats 30 times,
file_name_len = 9,
line = 384,
lock_present = 1
 }
 (gdb) p ssl_locks[19]
 $23 = {
thread_id = 140003803043584,
file_name = md_rand.c, '\000' repeats 30 times,
file_name_len = 9,
line = 389,
lock_present = 0
 }
 
 

By that I'm assuming it has acquired lock 18 and is attempting to get 19 but
failing because another thread already has it?

If so you need to know which thread has lock 19 and why it hasn't released it.
One way to do that is to store the details of each lock in a static array.

Steve.
--
Dr Stephen N. Henson. OpenSSL project core developer.
Commercial tech support now available see: http://www.openssl.org
__
OpenSSL Project http://www.openssl.org
User Support Mailing Listopenssl-users@openssl.org
Automated List Manager   majord...@openssl.org


RE: Deadlock in openssl1.0.1e.

2014-01-15 Thread Tayade, Nilesh
 -Original Message-
 From: owner-openssl-us...@openssl.org [mailto:owner-openssl-
 us...@openssl.org] On Behalf Of Dr. Stephen Henson
 Sent: Wednesday, January 15, 2014 6:30 PM
 To: openssl-users@openssl.org
 Subject: Re: Deadlock in openssl1.0.1e.
 
 On Wed, Jan 15, 2014, Tayade, Nilesh wrote:
 
 
 
  Thanks for your suggestions. I did try capturing the debug information and
 below is how the state of the locks is. The array per lock stores information 
 of
 the owner, file and lock_present i.e. lock is held or not.
  I see that the lock # 18 is held but when thread 19 was waiting, it was 
  trying
 for lock # 19. Not sure what's going on.
 
  (gdb) p ssl_locks[18]
  $22 = {
 thread_id = 140003803043584,
 file_name = md_rand.c, '\000' repeats 30 times,
 file_name_len = 9,
 line = 384,
 lock_present = 1
  }
  (gdb) p ssl_locks[19]
  $23 = {
 thread_id = 140003803043584,
 file_name = md_rand.c, '\000' repeats 30 times,
 file_name_len = 9,
 line = 389,
 lock_present = 0
  }
 
 
 
 By that I'm assuming it has acquired lock 18 and is attempting to get 19 but
 failing because another thread already has it?
 
 If so you need to know which thread has lock 19 and why it hasn't released it.
 One way to do that is to store the details of each lock in a static array.
 

Thanks. I have already captured the lock details in an array. Please see below. 
The leftmost digit is lock number index.
Lock 19 was taken by same thread and it was released (lock_present = 0). And 
this is causing confusion for me.

0 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
1 {thread_id = 140009934862112, file_name = err.c, '\000' repeats 34 
times, file_name_len = 5, line = 408, lock_present = 0},
2 {thread_id = 140003819828992,  file_name = ex_data.c, '\000' repeats 30 
times, file_name_len = 9, line = 511, lock_present = 0},
3 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
4 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
5 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
6 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
7 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
8 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
9 {thread_id = 140003803043584, file_name = rsa_eay.c, '\000' repeats30 
times, file_name_len = 9, line = 313, lock_present = 0},
10 {thread_id = 140003819828992, file_name = p_lib.c, '\000' repeats 32 
times, file_name_len = 7, line = 393, lock_present = 0},
11 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
12 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
13 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
14 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
15 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
16 {thread_id = 140009934862112,file_name = ssl_ciph.c, '\000' repeats 29 
times, file_name_len = 10, line = 481, lock_present = 0},
17 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
18 {thread_id = 140003803043584, file_name = md_rand.c, '\000' repeats 30 
times, file_name_len = 9, line = 384, lock_present = 1},
19 {thread_id = 140003803043584, file_name = md_rand.c, '\000' repeats 30 
times, file_name_len = 9, line = 389, lock_present = 0},
20 {thread_id = 140009934862112, file_name = mem_dbg.c, '\000' repeats 30 
times, file_name_len = 9, line = 282, lock_present = 0},
21 {thread_id = 140003819828992, file_name = bio_lib.c, '\000' repeats 30 
times, file_name_len = 9, line = 117, lock_present = 0},
22 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
23 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
24 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
25 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
26 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
27 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
28 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
29 {thread_id = 0, file_name = '\000' repeats 39 times, file_name_len = 0, 
line = 0, lock_present = 0},
30 {thread_id = 140009934862112,file_name

Re: Deadlock in openssl1.0.1e.

2014-01-15 Thread Dr. Stephen Henson
On Wed, Jan 15, 2014, Tayade, Nilesh wrote:

 
 Thanks. I have already captured the lock details in an array. Please see 
 below. The leftmost digit is lock number index.
 Lock 19 was taken by same thread and it was released (lock_present = 0). And 
 this is causing confusion for me.
 
 18 {thread_id = 140003803043584, file_name = md_rand.c, '\000' repeats 30 
 times, file_name_len = 9, line = 384, lock_present = 1},
 19 {thread_id = 140003803043584, file_name = md_rand.c, '\000' repeats 30 
 times, file_name_len = 9, line = 389, lock_present = 0},

Does it imply that thread has lock 18 but can't get 19 even though the lock
isn't present?

Does this deadlock happen almost immediately or some time later?

The way the PRNG works it's normal for lock 18 and 19 to be acquired and
released often sometimes simultaneously.

Steve.
--
Dr Stephen N. Henson. OpenSSL project core developer.
Commercial tech support now available see: http://www.openssl.org
__
OpenSSL Project http://www.openssl.org
User Support Mailing Listopenssl-users@openssl.org
Automated List Manager   majord...@openssl.org


Re: Deadlock in openssl1.0.1e.

2014-01-15 Thread Tayade, Nilesh
Yes. The issue is fairly reproducible and takes not more than 10-15mins to 
occur.

--
Thanks,
Nilesh
Sent via CanvasHD

Dr. Stephen Henson st...@openssl.org wrote:


On Wed, Jan 15, 2014, Tayade, Nilesh wrote:


 Thanks. I have already captured the lock details in an array. Please see 
 below. The leftmost digit is lock number index.
 Lock 19 was taken by same thread and it was released (lock_present = 0). And 
 this is causing confusion for me.

 18 {thread_id = 140003803043584, file_name = md_rand.c, '\000' repeats 30 
 times, file_name_len = 9, line = 384, lock_present = 1},
 19 {thread_id = 140003803043584, file_name = md_rand.c, '\000' repeats 30 
 times, file_name_len = 9, line = 389, lock_present = 0},

Does it imply that thread has lock 18 but can't get 19 even though the lock
isn't present?

Does this deadlock happen almost immediately or some time later?

The way the PRNG works it's normal for lock 18 and 19 to be acquired and
released often sometimes simultaneously.

Steve.
--
Dr Stephen N. Henson. OpenSSL project core developer.
Commercial tech support now available see: http://www.openssl.org
__
OpenSSL Project http://www.openssl.org
User Support Mailing Listopenssl-users@openssl.org
Automated List Manager   majord...@openssl.org
__
OpenSSL Project http://www.openssl.org
User Support Mailing Listopenssl-users@openssl.org
Automated List Manager   majord...@openssl.org


Re: Deadlock in openssl1.0.1e.

2014-01-14 Thread Lee Dilkie
and what are all the other threads doing? some other thread must already
own this lock and is waiting on something else.

-lee

On 2014-01-10 13:11, Tayade, Nilesh wrote:
 Hi,

 I am facing a deadlock issue on openssl1.0.1e. Please see the stack below. 
 Could anyone please advise if this is a known bug? Is there any 
 workaround/fix? FIPS is disabled on my system.

 (gdb) bt full
 #0  __lll_lock_wait () at 
 ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
 No locals.
 #1  0x00373da092c5 in _L_lock_868 () from /lib64/libpthread.so.0 No 
 symbol table info available.
 #2  0x00373da09197 in __pthread_mutex_lock (mutex=0x7418430) at 
 pthread_mutex_lock.c:61
 ignore1 = 128
 ignore2 = 121734192
 ignore3 = -512
 type = value optimized out
 id = 10428
 #3  0x009f3585 in locking_function (mode=9, n=18, file=0x3c79bdf93b 
 md_rand.c, line=228) at ssl_prf.c:653 No locals.
 #4  0x003c79b48f69 in ?? () from /usr/lib64/libcrypto.so.10 No symbol 
 table info available.
 #5  0x003c79b06c72 in ?? () from /usr/lib64/libcrypto.so.10 No symbol 
 table info available.
 #6  0x003c79b06aa0 in ?? () from /usr/lib64/libcrypto.so.10 No symbol 
 table info available.
 #7  0x003c79b07be2 in BN_BLINDING_create_param () from 
 /usr/lib64/libcrypto.so.10 No symbol table info available.
 #8  0x003c79b07d6a in BN_BLINDING_update () from 
 /usr/lib64/libcrypto.so.10 No symbol table info available.
 #9  0x003c79b07e56 in BN_BLINDING_convert_ex () from 
 /usr/lib64/libcrypto.so.10 No symbol table info available.

 --
 Thanks,
 Nilesh
 Sent via CanvasHD:??I???r?m
 (???Z+?K?+1???x??h???[?z?(???Z+???f?y??????f???h??)z{,??



RE: Deadlock in openssl1.0.1e.

2014-01-14 Thread Tayade, Nilesh
Hi,

I am not able to understand what’s going wrong, below is the latest stack trace 
I got. Thread 19 is killed by a watchdog, and it was waiting on a lock.
Other threads do not show any lock being held.

  21 Thread 23735  0x00373d2a6a8d in nanosleep () at 
../sysdeps/unix/syscall-template.S:82
  20 Thread 23752  0x00373d2da4f3 in select () at 
../sysdeps/unix/syscall-template.S:82
  19 Thread 23754  0x00373da0ed70 in __pause_nocancel () at 
../sysdeps/unix/syscall-template.S:82
  18 Thread 23724  0x00373da0e7ed in accept () at 
../sysdeps/unix/syscall-template.S:82
  17 Thread 23733  pthread_cond_wait@@GLIBC_2.3.2 () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  16 Thread 23704  pthread_cond_wait@@GLIBC_2.3.2 () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  15 Thread 23732  pthread_cond_wait@@GLIBC_2.3.2 () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  14 Thread 23705  pthread_cond_wait@@GLIBC_2.3.2 () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
  13 Thread 23753  0x00373da0e963 in recvfrom () at 
../sysdeps/unix/syscall-template.S:82
  12 Thread 23563  0x00373d2a6a8d in nanosleep () at 
../sysdeps/unix/syscall-template.S:82
  11 Thread 23734  0x00373da0e7ed in accept () at 
../sysdeps/unix/syscall-template.S:82
  10 Thread 23751  0x00373d2a6a8d in nanosleep () at 
../sysdeps/unix/syscall-template.S:82
  9 Thread 23756  0x00373d2c8c07 in sched_yield () at 
../sysdeps/unix/syscall-template.S:82
  8 Thread 23757  0x00373d2c8c07 in sched_yield () at 
../sysdeps/unix/syscall-template.S:82
  7 Thread 23755  0x00373d2c8c07 in sched_yield () at 
../sysdeps/unix/syscall-template.S:82
  6 Thread 23736  0x00373d2c8c07 in sched_yield () at 
../sysdeps/unix/syscall-template.S:82
  5 Thread 23759  0x00373d2c8c07 in sched_yield () at 
../sysdeps/unix/syscall-template.S:82
  4 Thread 23728  0x00373d2c8c07 in sched_yield () at 
../sysdeps/unix/syscall-template.S:82
  3 Thread 23731  0x00373d2c8c07 in sched_yield () at 
../sysdeps/unix/syscall-template.S:82
* 2 Thread 23703  0x00373da0e8ec in __libc_recv (fd=value optimized out, 
buf=value optimized out, n=value optimized out, flags=value optimized out)
at ../sysdeps/unix/sysv/linux/x86_64/recv.c:34
  1 Thread 23758  0x00373d2329a5 in raise (sig=6) at 
../nptl/sysdeps/unix/sysv/linux/raise.c:64

(gdb) thread 19
[Switching to thread 19 (Thread 23754)]#0  0x00373da0ed70 in 
__pause_nocancel () at ../sysdeps/unix/syscall-template.S:82
82  ../sysdeps/unix/syscall-template.S: No such file or directory.
in ../sysdeps/unix/syscall-template.S
(gdb) bt
#0  0x00373da0ed70 in __pause_nocancel () at 
../sysdeps/unix/syscall-template.S:82
#1  0x00373da08f0b in __pthread_mutex_lock_full (mutex=0x7418438) at 
pthread_mutex_lock.c:316
#2  0x009f3591 in locking_function (mode=9, n=19, file=0x3c79bdf93b 
Address 0x3c79bdf93b out of bounds, line=387) at ssl_prf.c:665
#3  0x003c79b49403 in ?? ()

--
Thanks,
Nilesh

From: owner-openssl-us...@openssl.org [mailto:owner-openssl-us...@openssl.org] 
On Behalf Of Lee Dilkie
Sent: Tuesday, January 14, 2014 7:25 PM
To: openssl-users@openssl.org
Subject: Re: Deadlock in openssl1.0.1e.

and what are all the other threads doing? some other thread must already own 
this lock and is waiting on something else.

-lee
On 2014-01-10 13:11, Tayade, Nilesh wrote:

Hi,



I am facing a deadlock issue on openssl1.0.1e. Please see the stack below. 
Could anyone please advise if this is a known bug? Is there any workaround/fix? 
FIPS is disabled on my system.



(gdb) bt full

#0  __lll_lock_wait () at 
../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136

No locals.

#1  0x00373da092c5 in _L_lock_868 () from /lib64/libpthread.so.0 No symbol 
table info available.

#2  0x00373da09197 in __pthread_mutex_lock (mutex=0x7418430) at 
pthread_mutex_lock.c:61

ignore1 = 128

ignore2 = 121734192

ignore3 = -512

type = value optimized out

id = 10428

#3  0x009f3585 in locking_function (mode=9, n=18, file=0x3c79bdf93b 
md_rand.c, line=228) at ssl_prf.c:653 No locals.

#4  0x003c79b48f69 in ?? () from /usr/lib64/libcrypto.so.10 No symbol table 
info available.

#5  0x003c79b06c72 in ?? () from /usr/lib64/libcrypto.so.10 No symbol table 
info available.

#6  0x003c79b06aa0 in ?? () from /usr/lib64/libcrypto.so.10 No symbol table 
info available.

#7  0x003c79b07be2 in BN_BLINDING_create_param () from 
/usr/lib64/libcrypto.so.10 No symbol table info available.

#8  0x003c79b07d6a in BN_BLINDING_update () from /usr/lib64/libcrypto.so.10 
No symbol table info available.

#9  0x003c79b07e56 in BN_BLINDING_convert_ex () from 
/usr/lib64/libcrypto.so.10 No symbol table info available.



--

Thanks,

Nilesh

Sent via CanvasHD:��IϮ��r�m
(���Z+�K‑�+1���x
��h���[�z�(���Z+�
��f�y�‑�f���h

Re: Deadlock in openssl1.0.1e.

2014-01-14 Thread Dr. Stephen Henson
On Tue, Jan 14, 2014, Tayade, Nilesh wrote:

 Hi,
 
 I am not able to understand what???s going wrong, below is the latest stack 
 trace I got. Thread 19 is killed by a watchdog, and it was waiting on a lock.
 Other threads do not show any lock being held.
 
 

I'd suggest you modify your locking callback to store the location (file, line
number) for the lock in question (18 which is CRYPTO_LOCK_RAND) when a lock is
acquired successfully and clear it when the lock is released. Then when you
get a deadlock you can see the last instance that held (but did not release)
the lock.

You could try compiling with LOCK_DEBUG but that will create a *lot* of output.

Steve.
--
Dr Stephen N. Henson. OpenSSL project core developer.
Commercial tech support now available see: http://www.openssl.org
__
OpenSSL Project http://www.openssl.org
User Support Mailing Listopenssl-users@openssl.org
Automated List Manager   majord...@openssl.org