Hello,
Here I describe something that works very badly on every OS I have
tested it on. I tested it on NT and Linux. I can't take w95/w98
seriously enough to even consider testing it there.
Suppose I have 4 processes. The first process sets a variable in the
shared-memory struct Tree:
  tree->waitforsearch = true;
Now the other 3 processes are started.
They run until they reach the following loops:
  do {
    ;                              /* spin until the search is set up */
  } while( tree->waitforsearch );
  InitializeSettings();
  while( tree->job ) {
    tree->waitingformove[ProcessorNumber] = true;  /* report: parked */
    do {
      ;                            /* spin until a job is handed out */
    } while( tree->waitformove );
    DoYourJob();
  }
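
For reference, a minimal sketch of how the shared struct behind these
loops might be laid out. Only the field names come from the loops above;
the int types, the MAX_PROCS bound and the helpers spin_while_set and
park_for_move are assumptions made for illustration. The flags are
declared volatile so that the compiler has to re-read them from shared
memory on every pass instead of hoisting the load out of the empty loop.

```c
#include <assert.h>

#define MAX_PROCS 4   /* assumption: 4 processes, as in the example */

/* Hypothetical layout; only the field names are taken from the loops. */
struct Tree {
    volatile int waitforsearch;              /* set while the search is prepared */
    volatile int waitformove;                /* set while no job is ready        */
    volatile int job;                        /* nonzero while work remains       */
    volatile int waitingformove[MAX_PROCS];  /* per-process "I am parked" flags  */
};

/* Busy-wait until *flag reads as zero; returns how often it was set. */
static unsigned long spin_while_set(const volatile int *flag)
{
    unsigned long polls = 0;
    while (*flag)
        polls++;
    return polls;
}

/* Worker side of the inner loop: report "parked", then wait for release. */
static void park_for_move(struct Tree *tree, int proc)
{
    assert(proc >= 0 && proc < MAX_PROCS);
    tree->waitingformove[proc] = 1;
    spin_while_set(&tree->waitformove);
}
```

Note that volatile only stops the compiler from caching the value. It
says nothing about how the OS schedules the spinning process.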
The first processor in the meantime is
waiting until all other processors are in the tree->waitformove loop,
and only then continues, so that the other processors can get a job.
So in fact at 2 places n-1 out of n processes must wait until they
are all at the same place (all except the first processor).
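
The first processor's side of this is not shown above. A minimal sketch
of what it could look like, under some assumptions: the struct layout,
MAX_PROCS, and the function names all_parked and release_workers are
made up for illustration, and clearing the per-process flags before the
release is a guess at the protocol, not something stated above.

```c
#include <assert.h>

#define MAX_PROCS 4   /* assumption: 4 processes, as in the example */

/* Hypothetical layout; only the field names come from the worker loop. */
struct Tree {
    volatile int waitformove;                /* set while no job is ready       */
    volatile int waitingformove[MAX_PROCS];  /* per-process "I am parked" flags */
};

/* Returns 1 once every process except process 0 has set its flag. */
static int all_parked(const struct Tree *tree, int nprocs)
{
    assert(nprocs >= 1 && nprocs <= MAX_PROCS);
    for (int i = 1; i < nprocs; i++)
        if (!tree->waitingformove[i])
            return 0;
    return 1;
}

/* Process 0: wait until all others are parked, then let them go. */
static void release_workers(struct Tree *tree, int nprocs)
{
    while (!all_parked(tree, nprocs))
        ;                              /* spin, same style as the workers */
    for (int i = 1; i < nprocs; i++)
        tree->waitingformove[i] = 0;   /* assumed: reset for the next round */
    tree->waitformove = 0;             /* workers leave their inner loop */
}
```

This relies on the stores becoming visible to the other processors in
the order they are written. That holds on x86, but on machines with a
weaker memory model explicit memory barriers would be needed.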
When starting a job, this synchronization must happen a few hundred
times, up to a thousand times, and all of that has to fit within
a second.
In the past the wait loops were more complicated. I've made them simpler now.
I deeply regret that.
Why do it the easy way when it can be done the hard way?
You are probably laughing now. You shouldn't.
In the past I could test my parallel program on a single processor.
Of course it was hard to detect real faults that way, but I could
at least test it. That is no longer possible.
If I start my program on a single processor now,
synchronization takes a very long time: it is many minutes before
both processes get 50% of the system time.
However, when I start 2 processes on a quad or dual machine,
those 1000 synchronisations STILL eat about 10 seconds,
which is also laughable compared to what it was.
Both NT and Linux completely f... up my program. When I first got the
results I doubted whether it was the OS or my program.
I have now figured out for sure that it's the OSes that cause this
problem. With this simple implementation they 'seem' to decide somehow
that a process is idle, yet
when this process badly needs system time,
it doesn't get it.
I do not know what in the OS causes this problem. If I knew, I might
work around it for the time being.
Anyway, I feel quite sure that this basically isn't my problem but a
serious OS problem. They're smart, but not smart enough
to understand that I'm running a parallel program where it's very
important that seemingly idle processes get a good deal of system time,
instead of waiting for a second before getting it.
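
Since the cause is not established above, the following is only
something to try, not a known fix: a minimal sketch of a wait loop that
spins for a bounded number of polls and then gives up the processor with
the POSIX call sched_yield(). The idea is that on a single processor the
process that has to clear the flag can then run sooner. The name
wait_with_yield and the SPIN_LIMIT value are assumptions for
illustration. On NT the rough counterpart of the yield would be Sleep(0).

```c
#include <assert.h>
#include <sched.h>    /* sched_yield(), POSIX */

#define SPIN_LIMIT 1000u   /* assumption: polls before yielding, needs tuning */

/* Wait until *flag reads as zero; returns how many times we yielded. */
static unsigned long wait_with_yield(const volatile int *flag)
{
    unsigned long yields = 0;
    unsigned int spins = 0;

    assert(flag != 0);
    while (*flag) {
        if (++spins >= SPIN_LIMIT) {
            sched_yield();     /* let another runnable process go first */
            spins = 0;
            yields++;
        }
    }
    return yields;
}
```

A worker would call it as wait_with_yield(&tree->waitformove), assuming
the field is declared as a volatile int. Whether this helps on a
multiprocessor machine, where every process already has its own
processor, would have to be measured.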
Greetings,
Vincent
-
Linux SMP list: FIRST see FAQ at http://www.irisa.fr/prive/mentre/smp-faq/
To Unsubscribe: send "unsubscribe linux-smp" to [EMAIL PROTECTED]