From the evidence given, I would *almost* certainly characterize
this as a deadlock bug in ntdll.dll, the deepest, most trusted
user-mode component of Windows!

Specifically, nothing should allow regular user code such as
OpenSSL to hold on to NT-internal critical sections while not
running inside NTDLL, and NTDLL should be designed not to
deadlock against itself.

There is one other possibility, though:

The OpenSSL code in rand_win.c holds on to a "snapshot" lock
on some of the heap data while walking it. It may be doing
this in a way that violates the rules presumed by the
deadlock-avoidance design of the speed-critical heap
locking code.
On 2/23/2012 2:11 PM, sandeep kiran p wrote:
Hi,
OpenSSL Version: 0.9.8o
OS : Windows Server 2008 R2 SP1
I am seeing a deadlock in a Windows application between two threads:
one thread calling Heap32First from OpenSSL's RAND_poll, and another
that allocates memory from the heap.
Here is the relevant stack trace from both the threads involved in
deadlock.
Thread 523
----------------
ntdll!ZwWaitForSingleObject+a
ntdll!RtlpWaitOnCriticalSection+e8
ntdll!RtlEnterCriticalSection+d1
ntdll!RtlpAllocateHeap+18a6
ntdll!RtlAllocateHeap+16c
ntdll!RtlpAllocateUserBlock+145
ntdll!RtlpLowFragHeapAllocFromContext+4e7
ntdll!RtlAllocateHeap+e4
ntdll!RtlInitializeCriticalSectionEx+d2
ntdll!RtlpActivateLowFragmentationHeap+181
ntdll!RtlpPerformHeapMaintenance+27
ntdll!RtlpAllocateHeap+1819
ntdll!RtlAllocateHeap+16c
Thread 454
-----------------
ntdll!NtWaitForSingleObject+0xa
ntdll!RtlpWaitOnCriticalSection+0xe8
ntdll!RtlEnterCriticalSection+0xd1
ntdll!RtlLockHeap+0x3b
ntdll!RtlpQueryExtendedHeapInformation+0xf4
ntdll!RtlQueryHeapInformation+0x3c
ntdll!RtlQueryProcessHeapInformation+0x3ad
ntdll!RtlQueryProcessDebugInformation+0x3b0
kernel32!Heap32First+0x71
WinDBG reports that threads 523 and 454 each hold a lock and are waiting
for each other's locks, thereby resulting in a deadlock.
On searching, I have found a couple of instances where such an issue has
been reported with Heap32Next on Windows 7, but I haven't found anything
that helps me solve the problem. Most of the references I found
conclude that this could be due to a bug in the heap
traversal APIs. If someone has faced a similar problem, can you guide
me to possible workarounds by which I can avoid the deadlock? Can I
remove the heap traversal routines and use some other sources of entropy?
Thanks for your help.
Regards
Sandeep
Enjoy
Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. http://www.wisemo.com
Transformervej 29, 2730 Herlev, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded