On 07/07/11 09:27, Andriy Gapon wrote:
on 06/07/2011 21:11 Nathan Whitehorn said the following:
On 07/06/11 13:00, Steve Kargl wrote:
AFAICT, it is a cpu affinity issue.  If I launch n+1 MPI images
on a system with n cpus/cores, then 2 (and sometimes 3) images
are stuck on a cpu and those 2 (or 3) images ping-pong on that
cpu.  I recall trying to use renice(8) to force some load
balancing, but vaguely remember that it did not help.
I've seen exactly this problem with multi-threaded math libraries, as well.
Exactly the same?  Let's see.

Using parallel GotoBLAS on FreeBSD gives terrible performance because the
threads keep migrating between CPUs, causing frequent cache misses.
So Steve reports that if he has Nthr > Ncpu, then some threads are "over-glued"
to a particular CPU, which results in sub-optimal scheduling for those threads.
I have to guess that Steve would want to see the threads being shuffled between
CPUs to produce more even CPU load.

On the other hand, you report that your threads keep being shuffled between CPUs
(I presume for Nthr == Ncpu case, where Nthr is a count of the number-crunching
threads).  And I guess that you want them to stay glued to particular CPUs.

So how is this the same problem?  In fact, it sounds like somewhat the opposite.
The only thing in common is that you both don't like how ULE works.

ULE has many knobs to tune its behavior.  Unfortunately they are not very well
documented and there are too many of them.  So, it's not easy to find which
combination would be the best for a particular work-load.  In your particular
case you might want to try increasing the value of kern.sched.affinity to
strengthen the affinity of threads to their CPUs.
Not everyone who uses FreeBSD is a developer or an expert, let alone an expert
in a very specific area of computer science and engineering or in a particular
part of the FreeBSD kernel such as its scheduling techniques.  I'm not capable
of tuning my servers via a lot of undocumented knobs, I'm sorry.  I would like
to do so if there were some kind of howto (handbook?).
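
As a small starting point for the one knob named above, here is a minimal
sketch of reading and changing kern.sched.affinity from C with
sysctlbyname(3).  It assumes the knob is an integer sysctl and that the kernel
runs the ULE scheduler; the helper names get_sched_affinity and
set_sched_affinity are made up for this example, and on any other system the
helpers simply fail with ENOTSUP.

```c
#include <errno.h>
#include <stddef.h>

#if defined(__FreeBSD__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/*
 * Read kern.sched.affinity into *out.
 * Returns 0 on success, -1 with errno set on failure.
 * Assumption: the knob is an int and exists only under the ULE scheduler.
 */
int get_sched_affinity(int *out)
{
#if defined(__FreeBSD__)
    size_t len = sizeof(*out);
    return sysctlbyname("kern.sched.affinity", out, &len, NULL, 0);
#else
    (void)out;
    errno = ENOTSUP;            /* not a FreeBSD system */
    return -1;
#endif
}

/*
 * Write a new value to kern.sched.affinity (normally requires root).
 * Returns 0 on success, -1 with errno set on failure.
 */
int set_sched_affinity(int value)
{
#if defined(__FreeBSD__)
    return sysctlbyname("kern.sched.affinity", NULL, NULL,
                        &value, sizeof(value));
#else
    (void)value;
    errno = ENOTSUP;            /* not a FreeBSD system */
    return -1;
#endif
}
```

The same thing can be done interactively with sysctl(8); which value suits a
given work-load still has to be found by experiment.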


Also, please note that FreeBSD support in GotoBLAS is not equivalent to Linux
support as I have pointed out before.  On Linux they bind their threads to CPUs
to avoid the situation that you describe.  Apparently they didn't know how to do
CPU-binding on FreeBSD, so this is not implemented.  You may have a motivation
to help them out with this.
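
For anyone who wants to pick this up, a minimal sketch of what CPU-binding
could look like on FreeBSD, next to the Linux call it would mirror.  It uses
cpuset_getaffinity(2) and cpuset_setaffinity(2) on FreeBSD and
sched_getaffinity(2) and sched_setaffinity(2) on Linux.  The helper names
(get_mask, set_mask, first_allowed_cpu, bind_self_to_cpu) are invented for
this example; this is not how GotoBLAS does it.

```c
#define _GNU_SOURCE             /* Linux: exposes cpu_set_t and the CPU_* macros */
#include <assert.h>
#include <stddef.h>

#if defined(__FreeBSD__)
#include <sys/param.h>
#include <sys/cpuset.h>
typedef cpuset_t cpu_mask_t;

/* An id of -1 with CPU_WHICH_TID means "the calling thread". */
static int get_mask(cpu_mask_t *m)
{
    assert(m != NULL);
    return cpuset_getaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
                              sizeof(*m), m);
}

static int set_mask(const cpu_mask_t *m)
{
    assert(m != NULL);
    return cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_TID, -1,
                              sizeof(*m), m);
}
#elif defined(__linux__)
#include <sched.h>
typedef cpu_set_t cpu_mask_t;

/* A pid of 0 means "the calling thread". */
static int get_mask(cpu_mask_t *m)
{
    assert(m != NULL);
    return sched_getaffinity(0, sizeof(*m), m);
}

static int set_mask(const cpu_mask_t *m)
{
    assert(m != NULL);
    return sched_setaffinity(0, sizeof(*m), m);
}
#else
#error "this sketch only covers FreeBSD and Linux"
#endif

/* Return the lowest-numbered CPU the calling thread may run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_mask_t m;

    if (get_mask(&m) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &m))
            return cpu;
    }
    return -1;
}

/* Restrict the calling thread to a single CPU; 0 on success, -1 on error. */
int bind_self_to_cpu(int cpu)
{
    cpu_mask_t m;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&m);
    CPU_SET(cpu, &m);
    return set_mask(&m);
}
```

Each number-crunching thread would call bind_self_to_cpu() once at start-up
with its own CPU number.  For the Nthr == Ncpu case that keeps the threads
from migrating; it does nothing for the Nthr > Ncpu case Steve describes.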


_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
