Hi, I'm characterizing multithreaded applications from the ALPbench benchmark suite, and I have an issue with part of the measured speedup curve.
I'm simulating 4-way, 8-way, and 16-way SPARC multicore architectures with a perfect memory system, all running Solaris. I increase the number of threads as follows: 1, 2, 4, 8, 16.

The strange part appears when I compare the 8-way and 16-way architectures running 8 threads. I get significantly higher performance when I increase the number of processors from 8 to 16, even though only 8 threads are created in both cases. My theory is that the scheduler does a better job in the 16-way case than in the 8-way case, but I'm not sure why this would be the case. Can someone confirm this?

I have noticed that the scheduler often moves a thread around among different processors. An improvement would be to make each thread stay on the processor it was first assigned to, but I don't know how to achieve this. I've heard about pbind, but to my understanding that only restricts the _process_ to a certain processor set, not the _threads_, right?

Regards,
Mladen

This message posted from opensolaris.org
