You mentioned that OpenSSL is holding a "snapshot" lock in rand_win.c. I couldn't find anything like that in that file. Could you point me to the specific code you are referring to? I would also like your opinion on possible workarounds that I can apply to avoid the deadlock.
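To make question 2 below concrete, here is a minimal sketch of what I have in mind. It is not taken from the 0.9.8o sources and I have not tried it in our application. The names os_random_bytes, seed_without_heap_walk, demo_seed and demo_seed_bytes are my own. The seed_fn type only mirrors the shape of RAND_add(buf, num, entropy) so that the sketch builds without OpenSSL, and the /dev/urandom branch is just a stand-in for building off Windows. Crediting 0.0 entropy to the second read is my own conservative assumption, since it comes from the same source that RAND_poll has already used.

```c
#include <stddef.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#include <wincrypt.h>   /* link with advapi32 */
#endif

/* Same shape as OpenSSL's RAND_add(const void *buf, int num, double entropy). */
typedef void (*seed_fn)(const void *buf, int num, double entropy);

/* Hypothetical helper: fill buf with len bytes from the OS generator.
 * Returns 1 on success, 0 on failure. */
int os_random_bytes(unsigned char *buf, size_t len)
{
#ifdef _WIN32
    HCRYPTPROV prov = 0;
    int ok = 0;

    if (CryptAcquireContextW(&prov, NULL, NULL, PROV_RSA_FULL,
                             CRYPT_VERIFYCONTEXT)) {
        ok = CryptGenRandom(prov, (DWORD)len, buf) ? 1 : 0;
        CryptReleaseContext(prov, 0);
    }
    return ok;
#else
    /* Stand-in for non-Windows builds only. */
    FILE *f = fopen("/dev/urandom", "rb");
    size_t got;

    if (f == NULL)
        return 0;
    got = fread(buf, 1, len, f);
    fclose(f);
    return got == len;
#endif
}

/* Hypothetical replacement for the Heap32First/Heap32Next step:
 * read a second block from the OS generator and pass it to the
 * seeding function, claiming no additional entropy for it. */
int seed_without_heap_walk(seed_fn add)
{
    unsigned char buf[64];

    if (!os_random_bytes(buf, sizeof buf))
        return 0;
    add(buf, (int)sizeof buf, 0.0);
    return 1;
}

/* Stand-in for RAND_add that only counts the bytes it was given. */
int demo_seed_bytes = 0;

void demo_seed(const void *buf, int num, double entropy)
{
    (void)buf;
    (void)entropy;
    demo_seed_bytes += num;
}
```

In a real build I would pass RAND_add in place of demo_seed. Whether this is an acceptable substitute for the heap walk is exactly what I am asking about.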
1. Can I remove the heap traversal routines Heap32First and Heap32Next? Will it badly affect the PRNG output later on?
2. Can I replace Heap32First and Heap32Next calls with any other sources of entropy? What if I make a call to CryptGenRandom again in place of the heap traversal routines?
3. Any other possible ways out?

Thanks,
Sandeep

On Thu, Feb 23, 2012 at 10:08 PM, Jakob Bohm <jb-open...@wisemo.com> wrote:

> From the evidence given, I would *almost* certainly characterize
> this as a deadlock bug in ntdll.dll, the deepest, most trusted
> user mode component of Windows!
>
> Specifically, nothing should allow regular user code such as
> OpenSSL to hold onto NT internal critical sections while not
> running inside NTDLL, and NTDLL should be designed not to
> deadlock against itself.
>
> There is one other possibility though:
>
> The OpenSSL code in rand_win.c holds on to a "snapshot" lock
> on some of the heap data while walking it. It may be doing
> this in a way not permitted by the rules that are presumed
> by the deadlock avoidance design of the speed critical heap
> locking code.
>
>
> On 2/23/2012 2:11 PM, sandeep kiran p wrote:
>
>> Hi,
>>
>> OpenSSL Version: 0.9.8o
>> OS : Windows Server 2008 R2 SP1
>>
>> I am seeing a deadlock in a windows application between two threads, one
>> thread calling Heap32First from OpenSSL's RAND_poll and the other that
>> allocates memory over the heap.
>>
>> Here is the relevant stack trace from both the threads involved in
>> deadlock.
>>
>> Thread 523
>> ----------------
>> ntdll!ZwWaitForSingleObject+a
>> ntdll!RtlpWaitOnCriticalSection+e8
>> ntdll!RtlEnterCriticalSection+d1
>> ntdll!RtlpAllocateHeap+18a6
>> ntdll!RtlAllocateHeap+16c
>> ntdll!RtlpAllocateUserBlock+145
>> ntdll!RtlpLowFragHeapAllocFromContext+4e7
>> ntdll!RtlAllocateHeap+e4
>> ntdll!RtlInitializeCriticalSectionEx+d2
>> ntdll!RtlpActivateLowFragmentationHeap+181
>> ntdll!RtlpPerformHeapMaintenance+27
>> ntdll!RtlpAllocateHeap+1819
>> ntdll!RtlAllocateHeap+16c
>>
>>
>> Thread 454
>> -----------------
>> ntdll!NtWaitForSingleObject+0xa
>> ntdll!RtlpWaitOnCriticalSection+0xe8
>> ntdll!RtlEnterCriticalSection+0xd1
>> ntdll!RtlLockHeap+0x3b
>> ntdll!RtlpQueryExtendedHeapInformation+0xf4
>> ntdll!RtlQueryHeapInformation+0x3c
>> ntdll!RtlQueryProcessHeapInformation+0x3ad
>> ntdll!RtlQueryProcessDebugInformation+0x3b0
>> kernel32!Heap32First+0x71
>>
>> WinDBG reports that thread 523 and 454 both hold locks and are waiting
>> for each other locks thereby resulting in a deadlock.
>>
>> On searching, I have found a couple instances where such an issue has
>> been reported with Heap32Next on Windows 7 but haven't found anything that
>> helps me solve the problem. Most of the references I found conclude that
>> this could be because of a possible bug in heap traversal APIs. If someone
>> has faced a similar problem, can you guide me to possible workarounds by
>> which I can avoid the deadlock? Can I remove the heap traversal routines
>> and find some other sources of entropy?
>>
>> Thanks for your help.
>>
>> Regards
>> Sandeep
>>
> Enjoy
>
> Jakob
> --
> Jakob Bohm, CIO, Partner, WiseMo A/S. http://www.wisemo.com
> Transformervej 29, 2730 Herlev, Denmark. Direct +45 31 13 16 10
> This public discussion message is non-binding and may contain errors.
> WiseMo - Remote Service Management for PCs, Phones and Embedded