From the evidence given, I would *almost* certainly characterize
this as a deadlock bug in ntdll.dll, the deepest, most trusted
user mode component of Windows!

Specifically, nothing should allow regular user code such as
OpenSSL to hold onto NT internal critical sections while not
running inside NTDLL, and NTDLL should be designed not to
deadlock against itself.

There is one other possibility though:

The OpenSSL code in rand_win.c holds on to a "snapshot" lock
on some of the heap data while walking it.  It may be doing
so in a way that violates the rules assumed by the
deadlock-avoidance design of the speed-critical heap locking
code.
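
To make the lock-ordering concern concrete, here is a rough,
portable sketch (POSIX threads, not the Win32 API) of how two
parties taking two locks in opposite order end up waiting on
each other, and why a consistent order avoids it.  The names
heap_a_lock, heap_b_lock, "walker" and "allocator" are made up
for illustration; this is not how ntdll or rand_win.c actually
implement their locking, and both parties are simulated on a
single thread with trylock so the demo itself cannot hang.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for two per-heap locks (NOT the real ntdll locks). */
static pthread_mutex_t heap_a_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t heap_b_lock = PTHREAD_MUTEX_INITIALIZER;

/* Lock a mutex that is known to be free; abort if that assumption fails. */
static void must_lock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    assert(rc == 0);
    (void)rc;
}

/*
 * Opposite order: the "walker" holds A and wants B, while the
 * "allocator" holds B and wants A.  Returns 1 if both would block.
 */
int inverted_order_deadlocks(void)
{
    int walker_blocked, allocator_blocked;

    must_lock(&heap_a_lock);            /* walker holds A    */
    must_lock(&heap_b_lock);            /* allocator holds B */

    /* trylock reports EBUSY where a real lock call would wait forever. */
    walker_blocked    = (pthread_mutex_trylock(&heap_b_lock) == EBUSY);
    allocator_blocked = (pthread_mutex_trylock(&heap_a_lock) == EBUSY);

    pthread_mutex_unlock(&heap_b_lock);
    pthread_mutex_unlock(&heap_a_lock);
    return walker_blocked && allocator_blocked;
}

/*
 * Same order (A before B) for both parties: the allocator has to wait
 * for A, but it holds nothing, so the walker can still take B and finish.
 */
int consistent_order_deadlocks(void)
{
    int walker_blocked, allocator_blocked;

    must_lock(&heap_a_lock);            /* walker holds A */
    allocator_blocked = (pthread_mutex_trylock(&heap_a_lock) == EBUSY);
    walker_blocked    = (pthread_mutex_trylock(&heap_b_lock) == EBUSY);

    if (!walker_blocked)
        pthread_mutex_unlock(&heap_b_lock);
    pthread_mutex_unlock(&heap_a_lock);
    return walker_blocked && allocator_blocked;
}
```

In the first function both sides are stuck, which is the shape
WinDBG reports for the two threads quoted below; in the second
only one side waits, and it holds nothing the other side needs.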

On 2/23/2012 2:11 PM, sandeep kiran p wrote:
Hi,

OpenSSL Version: 0.9.8o
OS : Windows Server 2008 R2 SP1

I am seeing a deadlock in a Windows application between two threads: one calls Heap32First from OpenSSL's RAND_poll, and the other allocates memory from the heap.

Here are the relevant stack traces from both threads involved in the deadlock.

Thread 523
----------------
ntdll!ZwWaitForSingleObject+a
ntdll!RtlpWaitOnCriticalSection+e8
ntdll!RtlEnterCriticalSection+d1
ntdll!RtlpAllocateHeap+18a6
ntdll!RtlAllocateHeap+16c
ntdll!RtlpAllocateUserBlock+145
ntdll!RtlpLowFragHeapAllocFromContext+4e7
ntdll!RtlAllocateHeap+e4
ntdll!RtlInitializeCriticalSectionEx+d2
ntdll!RtlpActivateLowFragmentationHeap+181
ntdll!RtlpPerformHeapMaintenance+27
ntdll!RtlpAllocateHeap+1819
ntdll!RtlAllocateHeap+16c


Thread 454
-----------------
ntdll!NtWaitForSingleObject+0xa
ntdll!RtlpWaitOnCriticalSection+0xe8
ntdll!RtlEnterCriticalSection+0xd1
ntdll!RtlLockHeap+0x3b
ntdll!RtlpQueryExtendedHeapInformation+0xf4
ntdll!RtlQueryHeapInformation+0x3c
ntdll!RtlQueryProcessHeapInformation+0x3ad
ntdll!RtlQueryProcessDebugInformation+0x3b0
kernel32!Heap32First+0x71

WinDBG reports that threads 523 and 454 both hold locks and are each waiting for the other's lock, resulting in a deadlock.

On searching, I have found a couple of instances where such an issue has been reported with Heap32Next on Windows 7, but nothing that helps me solve the problem. Most of the references I found conclude that this could be due to a bug in the heap traversal APIs. If someone has faced a similar problem, can you point me to workarounds that avoid the deadlock? Can I remove the heap traversal routines and use some other sources of entropy?

Thanks for your help.

Regards
Sandeep






Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S.  http://www.wisemo.com
Transformervej 29, 2730 Herlev, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org
