Re: Deadlock in RAND_poll's Heap32First call

sandeep kiran p Sat, 25 Feb 2012 06:31:52 -0800

MSDN says

"To enumerate the heap or module states for all processes, specify
TH32CS_SNAPALL and set *th32ProcessID* to zero."


So it presumably does the heap and module walk for all processes and not
only for the current process.

Do you think *CreateToolhelp32Snapshot's* lock on the read-only snapshot
could be the culprit?

I am now thinking about removing the calls to Heap32First and Heap32Next in
rand_win.c and looking for alternative sources of entropy.
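
For illustration only, here is a minimal sketch of drawing seed bytes from
the operating system's generator instead of walking the heap. The helper
name os_random_bytes is hypothetical and is not part of OpenSSL or
rand_win.c; the non-Windows branch is an assumption added only so the
sketch compiles elsewhere. The Windows branch uses CryptAcquireContextW,
CryptGenRandom and CryptReleaseContext from advapi32.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#include <wincrypt.h>   /* link with advapi32 */
#else
#include <stdio.h>
#endif

/* Hypothetical helper (not part of OpenSSL): fill buf with len bytes
 * from the operating system's generator.
 * Returns 1 on success, 0 on failure. */
int os_random_bytes(unsigned char *buf, size_t len)
{
#ifdef _WIN32
    HCRYPTPROV prov = 0;
    int ok;

    assert(buf != NULL);
    /* CRYPT_VERIFYCONTEXT: no persistent key container is needed. */
    if (!CryptAcquireContextW(&prov, NULL, NULL, PROV_RSA_FULL,
                              CRYPT_VERIFYCONTEXT | CRYPT_SILENT))
        return 0;
    ok = CryptGenRandom(prov, (DWORD)len, buf) ? 1 : 0;
    CryptReleaseContext(prov, 0);
    return ok;
#else
    /* Assumption: fallback for non-Windows builds of this sketch. */
    FILE *f;
    size_t got;

    assert(buf != NULL);
    f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return 0;
    got = fread(buf, 1, len, f);
    fclose(f);
    return got == len;
#endif
}
```

The bytes obtained this way could then be fed to the PRNG through RAND_add.
Whether that is enough on its own depends on how far CryptGenRandom is
trusted.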

Thanks for your help.

Regards
Sandeep

On Sat, Feb 25, 2012 at 2:38 AM, Jakob Bohm <jb-open...@wisemo.com> wrote:

> On 2/24/2012 2:14 PM, sandeep kiran p wrote:
>
>> You mentioned that OpenSSL is holding a "snapshot" lock in rand_win.c. I
>> couldn't find anything like that in that file. Can you specifically point
>> me to the code that you are referring to? I would also like to get an
>> opinion on possible workarounds that I can enforce to avoid the deadlock.
>>
> In OpenSSL 1.0.0 it is line 486 which says
>
>         module_next && (handle = snap(TH32CS_SNAPALL,0))
>
> where snap is a pointer to KERNEL32.CreateToolhelp32Snapshot()
>
>
>> 1. Can I remove the heap traversal routines Heap32First and Heap32Next?
>> Will it badly affect the PRNG output later on?
>>
> It depends how good the other sources of random numbers are,
> more below.
>
>
>> 2. Can I replace Heap32First and Heap32Next calls with any other sources
>> of entropy? What if I make a call to CryptGenRandom again in place of the
>> heap traversal routines?
>>
> Calling CryptGenRandom() twice isn't going to help much.
>
> If CryptGenRandom() is as good as it is "supposed to" be,
> the other entropy sources are not really needed.  But if
> CryptGenRandom() is somehow broken or untrustworthy,
> calling it a million times wouldn't help.
>
> Anyway, I have my doubts about the value of using the local
> heap walking functions as a source of entropy, as they
> reflect only the state of your own process.  Pretending that
> the address and size of each malloc()-ed memory block in
> your process contributes 3 to 5 bytes of additional entropy
> (which is what the comments say) is wildly optimistic and
> quite unrealistic.
>
> In a long-running web browser or a similarly long running
> web server, the net total of the memory layout effects of
> thousands of semi-chaotic previous network requests and
> user actions might contribute a total of 10 to 50 bits of
> entropy.  But in a typical freshly started process, the
> layout is going to be pretty deterministic (if the OS
> uses address layout randomization, it probably does so
> based on entropy sources already incorporated into its
> standard random source, i.e. CryptGenRandom() on Windows).
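
To put rough numbers on the gap described above, here is a small
arithmetic sketch. The function names and the block count used in the
note below are hypothetical; only the 3-to-5-byte and 10-to-50-bit
figures come from the text.

```c
#include <assert.h>

/* Bits of entropy credited if each heap block is assumed to add
 * bytes_per_block bytes (the 3-to-5-byte figure quoted above). */
unsigned long credited_bits(unsigned long blocks,
                            unsigned long bytes_per_block)
{
    return blocks * bytes_per_block * 8UL;
}

/* How many times larger the credited figure is than a realistic total.
 * realistic_bits is the whole-process estimate (10 to 50 in the text).
 * Integer division, so the result is rounded down. */
unsigned long overcredit_factor(unsigned long blocks,
                                unsigned long bytes_per_block,
                                unsigned long realistic_bits)
{
    assert(realistic_bits > 0);
    return credited_bits(blocks, bytes_per_block) / realistic_bits;
}
```

For a hypothetical process with 1000 heap blocks, credited_bits(1000, 3)
gives 24000 bits, which is 480 times the most generous 50-bit estimate.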
>
>
>> 3. Any other possible ways out?
>>
>> Thanks,
>> Sandeep
>>
>> On Thu, Feb 23, 2012 at 10:08 PM, Jakob Bohm <jb-open...@wisemo.com> wrote:
>>
>>    From the evidence given, I would *almost* certainly characterize
>>    this as a deadlock bug in ntdll.dll, the deepest, most trusted
>>    user mode component of Windows!
>>
>>    Specifically, nothing should allow regular user code such as
>>    OpenSSL to hold onto NT internal critical sections while not
>>    running inside NTDLL, and NTDLL should be designed not to
>>    deadlock against itself.
>>
>>    There is one other possibility though:
>>
>>    The OpenSSL code in rand_win.c holds on to a "snapshot" lock
>>    on some of the heap data while walking it.  It may be doing
>>    this in a way not permitted by the rules that are presumed
>>    by the deadlock avoidance design of the speed critical heap
>>    locking code.
>>
>>
>>    On 2/23/2012 2:11 PM, sandeep kiran p wrote:
>>
>>        Hi,
>>
>>        OpenSSL Version: 0.9.8o
>>        OS : Windows Server 2008 R2 SP1
>>
>>        I am seeing a deadlock in a Windows application between two
>>        threads, one thread calling Heap32First from OpenSSL's
>>        RAND_poll and the other allocating memory from the heap.
>>
>>        Here are the relevant stack traces from both threads
>>        involved in the deadlock.
>>
>>        Thread 523
>>        ----------------
>>        ntdll!ZwWaitForSingleObject+a
>>        ntdll!RtlpWaitOnCriticalSection+e8
>>        ntdll!RtlEnterCriticalSection+d1
>>        ntdll!RtlpAllocateHeap+18a6
>>        ntdll!RtlAllocateHeap+16c
>>        ntdll!RtlpAllocateUserBlock+145
>>        ntdll!RtlpLowFragHeapAllocFromContext+4e7
>>        ntdll!RtlAllocateHeap+e4
>>        ntdll!RtlInitializeCriticalSectionEx+d2
>>        ntdll!RtlpActivateLowFragmentationHeap+181
>>        ntdll!RtlpPerformHeapMaintenance+27
>>        ntdll!RtlpAllocateHeap+1819
>>        ntdll!RtlAllocateHeap+16c
>>
>>
>>        Thread 454
>>        -----------------
>>        ntdll!NtWaitForSingleObject+0xa
>>        ntdll!RtlpWaitOnCriticalSection+0xe8
>>        ntdll!RtlEnterCriticalSection+0xd1
>>        ntdll!RtlLockHeap+0x3b
>>        ntdll!RtlpQueryExtendedHeapInformation+0xf4
>>        ntdll!RtlQueryHeapInformation+0x3c
>>        ntdll!RtlQueryProcessHeapInformation+0x3ad
>>        ntdll!RtlQueryProcessDebugInformation+0x3b0
>>        kernel32!Heap32First+0x71
>>
>>        WinDBG reports that threads 523 and 454 both hold locks and are
>>        waiting for each other's locks, resulting in a deadlock.
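
The mutual wait described above can be modelled as a cycle in a wait-for
graph. The sketch below is only an illustration of that idea: the function
name and the lock indices are hypothetical, since the traces do not say
which critical section each thread owns.

```c
#include <assert.h>

#define NO_ONE (-1)

/* lock_owner[l] : index of the thread holding lock l, or NO_ONE
 * waiting_on[t] : index of the lock thread t is blocked on, or NO_ONE
 * Returns 1 if following "waits on -> owned by" from thread `start`
 * leads back to `start`, i.e. the thread is part of a deadlock cycle. */
int in_wait_cycle(const int *lock_owner, const int *waiting_on,
                  int nthreads, int start)
{
    int t = start;
    int step;

    assert(start >= 0 && start < nthreads);
    for (step = 0; step < nthreads; step++) {
        int lock = waiting_on[t];
        if (lock == NO_ONE)
            return 0;           /* thread t can still run */
        t = lock_owner[lock];
        if (t == NO_ONE)
            return 0;           /* lock is free, so the wait will end */
        if (t == start)
            return 1;           /* came back to where we began */
    }
    return 0;
}
```

With index 0 standing for thread 523 and index 1 for thread 454, setting
lock_owner to {0, 1} and waiting_on to {1, 0} describes each thread
holding one lock while blocked on the other's, and the function reports
a cycle for either starting thread.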
>>
>>        On searching, I have found a couple of instances where such an
>>        issue has been reported with Heap32Next on Windows 7, but I
>>        haven't found anything that helps me solve the problem. Most
>>        of the references I found conclude that this could be due to
>>        a bug in the heap traversal APIs. If someone has faced
>>        a similar problem, can you guide me to possible workarounds by
>>        which I can avoid the deadlock? Can I remove the heap
>>        traversal routines and find some other sources of entropy?
>>
>>        Thanks for your help.
>>
>>
> Enjoy
>
> Jakob
> --
> Jakob Bohm, CIO, Partner, WiseMo A/S.  http://www.wisemo.com
> Transformervej 29, 2730 Herlev, Denmark.  Direct +45 31 13 16 10
> This public discussion message is non-binding and may contain errors.
> WiseMo - Remote Service Management for PCs, Phones and Embedded
>
> ______________________________________________________________________
> OpenSSL Project                                 http://www.openssl.org
> User Support Mailing List                    openssl-users@openssl.org
> Automated List Manager                           majord...@openssl.org
>
