On Sun, 2 Jul 2000, Arne Ansper wrote:
>
>
> > Hmm, I was able to create 354406 unnamed mutexes, before
> > CreateMutex() failed with ERROR_NOT_ENOUGH_QUOTA. Tested under
> > NT4 SP6.
>
> btw, why use mutexes at all? openssl uses only unnamed mutexes and always
> waits indefinitely on the mutex. so we could use critical sections
> instead. they are much faster and basically
> unlimited. InitializeCriticalSection etc.
>
That's true, indeed. I realized this shortly after posting my
previous letter :)
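For illustration, here is a minimal sketch of what such a swap might look like behind a small wrapper. The demo_lock_* names are hypothetical (they are neither OpenSSL nor Win32 identifiers), and the pthread branch is only a stand-in so that the sketch also builds outside Windows:

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
typedef CRITICAL_SECTION demo_lock_t;   /* replaces the HANDLE from CreateMutex() */
#else
#include <pthread.h>
typedef pthread_mutex_t demo_lock_t;    /* stand-in for non-Windows builds */
#endif

/* demo_lock_* are hypothetical wrapper names used only in this sketch. */
void demo_lock_init(demo_lock_t *l)
{
#ifdef _WIN32
    InitializeCriticalSection(l);
#else
    int rc = pthread_mutex_init(l, NULL);
    assert(rc == 0);
    (void)rc;
#endif
}

void demo_lock_acquire(demo_lock_t *l)
{
#ifdef _WIN32
    EnterCriticalSection(l);    /* waits without a timeout, like INFINITE */
#else
    pthread_mutex_lock(l);
#endif
}

void demo_lock_release(demo_lock_t *l)
{
#ifdef _WIN32
    LeaveCriticalSection(l);
#else
    pthread_mutex_unlock(l);
#endif
}

void demo_lock_destroy(demo_lock_t *l)
{
#ifdef _WIN32
    DeleteCriticalSection(l);
#else
    pthread_mutex_destroy(l);
#endif
}

/* Adds 1 to *counter n times, taking the lock around each update. */
long demo_locked_add(demo_lock_t *l, long *counter, long n)
{
    long i;
    for (i = 0; i < n; ++i) {
        demo_lock_acquire(l);
        ++*counter;
        demo_lock_release(l);
    }
    return *counter;
}
```

A caller would run demo_lock_init() once, bracket each protected region with demo_lock_acquire() and demo_lock_release(), and call demo_lock_destroy() at shutdown. Note that a critical section cannot be shared between processes, which should not matter here if only unnamed mutexes are being replaced.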
Concerning relative performance of mutexes and critical sections
- I wrote a simple test program. Code is MSC specific.
-------------------8<-----------------------------------------------
#include <windows.h>
#include <stdio.h>
#include <assert.h>

/* SetCriticalSectionSpinCount() is a WINAPI function returning DWORD. */
typedef DWORD (WINAPI *set_spin_count_fn)(LPCRITICAL_SECTION, DWORD);

static set_spin_count_fn set_spin_count = NULL;

/* Fallback used when kernel32.dll does not export the real function. */
static DWORD WINAPI dummy_set_spin_count(LPCRITICAL_SECTION s, DWORD c)
{
    (void)s;
    (void)c;
    return 0;
}

/*
 * cpuid serializes the instruction stream; rdtsc then leaves the
 * counter in edx:eax, which is where MSC expects an __int64 return
 * value on 32-bit x86 (the compiler warns about the missing return).
 */
__int64
read_pentium_timestamp_counter(void)
{
    __asm cpuid;
    __asm rdtsc;
}

int main(void)
{
    unsigned i;
    HANDLE h;
    HMODULE kernel32;
    __int64 t0, t1, t;
    CRITICAL_SECTION cs;

    h = CreateMutex(NULL, FALSE, NULL);
    assert(h);

    /*
     * Setting spin count makes sense on SMP systems, especially
     * when critical sections are locked for a very short time.
     * Spin count tells EnterCriticalSection() how many times it
     * should try to lock the critical section in userland before it
     * decides that locking is going to take a lot of time anyway
     * and falls back to a kernel wait (which is expensive due to
     * context switches) on the kernel synchronization object
     * associated with the critical section.
     * Spin count is ignored on single CPU systems (according to m$ docs).
     *
     * This function requires a recent windows version, therefore
     * we try to figure out its address dynamically.
     */
    kernel32 = GetModuleHandle("kernel32.dll");
    assert(kernel32);
    set_spin_count = (set_spin_count_fn)GetProcAddress(kernel32,
        "SetCriticalSectionSpinCount");
    if (set_spin_count == NULL) {
        printf("%s not found in %s\n", "SetCriticalSectionSpinCount()",
            "kernel32.dll");
        set_spin_count = dummy_set_spin_count;
    }

    InitializeCriticalSection(&cs);
    printf("Setting spin count to %u, previous value was %lu\n",
        8192u, (unsigned long)set_spin_count(&cs, 8192));
    printf("Size of CRITICAL_SECTION: %u\n\n",
        (unsigned)sizeof(CRITICAL_SECTION));

    /* Average cost of two back-to-back counter reads. */
    for (i = 0, t = 0; i < 1024; ++i) {
        t0 = read_pentium_timestamp_counter();
        t1 = read_pentium_timestamp_counter();
        t += t1 - t0;
    }
    t /= i;
    /*
     * Value of t may not be precise, because the first
     * few cpuid instruction invocations in a tight loop
     * will take somewhat longer than subsequent ones, by
     * 5..6 ticks or so.
     */
    printf("%-40s%5I64u CPU ticks (total overhead)\n",
        "read_pentium_timestamp_counter()", t);

    t0 = read_pentium_timestamp_counter();
    EnterCriticalSection(&cs);
    t1 = read_pentium_timestamp_counter();
    printf("%-40s%5I64u CPU ticks\n", "EnterCriticalSection()",
        t1 - t0 - t);

    t0 = read_pentium_timestamp_counter();
    LeaveCriticalSection(&cs);
    t1 = read_pentium_timestamp_counter();
    printf("%-40s%5I64u CPU ticks\n", "LeaveCriticalSection()",
        t1 - t0 - t);

    t0 = read_pentium_timestamp_counter();
    switch (WaitForSingleObject(h, INFINITE)) {
    case WAIT_OBJECT_0:
        break;
    default:
        assert(0);
    }
    t1 = read_pentium_timestamp_counter();
    printf("%-40s%5I64u CPU ticks\n", "WaitForSingleObject()",
        t1 - t0 - t);

    t0 = read_pentium_timestamp_counter();
    ReleaseMutex(h);
    t1 = read_pentium_timestamp_counter();
    printf("%-40s%5I64u CPU ticks\n", "ReleaseMutex()", t1 - t0 - t);

    DeleteCriticalSection(&cs);
    CloseHandle(h);
    return 0;
}
---------------------------------------->8---------------------------------
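The calibrate-and-subtract step in the program above (average the cost of two back-to-back counter reads, then subtract it from each measurement) can be sketched portably. This is only an illustration: tick_reader_fn, average_read_overhead(), net_ticks() and fake_read_ticks() are hypothetical names, and the fake counter merely stands in for rdtsc so the arithmetic can be checked on any machine:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Any function that returns a monotonically increasing tick count. */
typedef uint64_t (*tick_reader_fn)(void);

/* Average cost, in ticks, of two back-to-back counter reads. */
uint64_t average_read_overhead(tick_reader_fn read_ticks, unsigned iterations)
{
    uint64_t total = 0;
    unsigned i;

    assert(read_ticks != NULL);
    assert(iterations > 0);
    for (i = 0; i < iterations; ++i) {
        uint64_t t0 = read_ticks();
        uint64_t t1 = read_ticks();
        total += t1 - t0;
    }
    return total / iterations;
}

/*
 * Net cost of the code between two reads. Clamps at zero so that
 * jitter in the overhead estimate cannot wrap around to a huge
 * unsigned value.
 */
uint64_t net_ticks(uint64_t t0, uint64_t t1, uint64_t overhead)
{
    uint64_t raw = t1 - t0;
    return raw > overhead ? raw - overhead : 0;
}

/* Deterministic stand-in for rdtsc: advances 7 ticks per read. */
static uint64_t fake_now;

uint64_t fake_read_ticks(void)
{
    fake_now += 7;
    return fake_now;
}
```

With the fake counter, average_read_overhead(fake_read_ticks, 1024) comes out as 7. The clamp in net_ticks() is a design choice of this sketch; the test program itself subtracts without clamping.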
Sample output:
Setting spin count to 8192, previous value was 0
Size of CRITICAL_SECTION: 24

read_pentium_timestamp_counter()          132 CPU ticks (total overhead)
EnterCriticalSection()                    153 CPU ticks
LeaveCriticalSection()                     86 CPU ticks
WaitForSingleObject()                    8361 CPU ticks
ReleaseMutex()                           2321 CPU ticks
BTW, I just discovered that redirecting stdout to a file
improves performance significantly. Here's the output on the same
system with stdout redirected to a file:
Setting spin count to 8192, previous value was 0
Size of CRITICAL_SECTION: 24

read_pentium_timestamp_counter()          132 CPU ticks (total overhead)
EnterCriticalSection()                     75 CPU ticks
LeaveCriticalSection()                     10 CPU ticks
WaitForSingleObject()                    5862 CPU ticks
ReleaseMutex()                           1500 CPU ticks
I do not have a good explanation for this. Hm.
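To put the two runs side by side, the mutex-to-critical-section cost ratios can be worked out with a few lines of C. The tick counts are copied from the output above; cost_ratio() and print_cost_ratios() are hypothetical helper names. Keep in mind that each figure is a single uncontended measurement, so the ratios are rough indications only:

```c
#include <assert.h>
#include <stdio.h>

/* How many times more ticks the mutex call took than the critical-section call. */
double cost_ratio(unsigned mutex_ticks, unsigned cs_ticks)
{
    assert(cs_ticks != 0);
    return (double)mutex_ticks / (double)cs_ticks;
}

/* Prints one ratio per measurement pair from the two runs. */
void print_cost_ratios(void)
{
    /* Tick counts copied from the sample output in this message. */
    static const struct {
        const char *label;
        unsigned mutex_ticks;   /* WaitForSingleObject() or ReleaseMutex() */
        unsigned cs_ticks;      /* Enter- or LeaveCriticalSection() */
    } runs[] = {
        { "console, acquire", 8361, 153 },
        { "console, release", 2321,  86 },
        { "file, acquire",    5862,  75 },
        { "file, release",    1500,  10 },
    };
    size_t i;

    for (i = 0; i < sizeof(runs) / sizeof(runs[0]); ++i)
        printf("%-18s %6.1fx\n", runs[i].label,
               cost_ratio(runs[i].mutex_ticks, runs[i].cs_ticks));
}
```

Acquiring the mutex cost roughly 55 times as many ticks as entering the critical section in the console run, and roughly 78 times as many with stdout redirected.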
Have fun :)
--
vix