Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Jeff Squyres
Yes, it may well be... It needs to handle the case where paffinity can return "Sorry, I don't have this information for you." On Jul 23, 2008, at 11:40 AM, Lenny Verkhovsky wrote: can this also be a reason for segv on NUMA nodes (#1382) that I can't recreate? On 7/23/08, Jeff Squyres…
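
A minimal sketch in C of the kind of defensive handling being described. All names here (query_num_processors, etc.) are hypothetical stand-ins, not the actual ORTE/ODLS internals; the point is only that an "I don't know" answer from the affinity layer must not be treated as "oversubscribed":

    #include <stdio.h>

    /* Hypothetical stand-in for a paffinity/PLPA query: returns 0 and
     * fills *nprocs on success, or -1 when the kernel exposes no
     * topology information ("Sorry, I don't have this information for
     * you"). Name and signature are illustrative only. */
    static int query_num_processors(int *nprocs)
    {
        (void) nprocs;
        return -1;  /* simulate an older Linux kernel with no topology data */
    }

    int main(void)
    {
        int nprocs;
        int num_local_children = 2;  /* processes launched on this node */
        int oversubscribed;

        if (query_num_processors(&nprocs) != 0) {
            /* No topology info available: do NOT assume oversubscription.
             * Treating "unknown" as "yes" appears to be what silently
             * enabled sched_yield and caused the slowdown in this thread. */
            oversubscribed = 0;
        } else {
            oversubscribed = (num_local_children > nprocs);
        }

        printf("oversubscribed=%d -> sched_yield %s\n",
               oversubscribed, oversubscribed ? "on" : "off");
        return 0;
    }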

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Jeff Squyres
On Jul 23, 2008, at 11:48 AM, Terry Dontje wrote: Ok, so I thought I saw a slowdown with Solaris. Not sure it is the same thing (wouldn't think so) but I'll test this out soon. Ok. Check the state of the solaris paffinity component (I don't know what state it's in these days) and ensure t…
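
A quick way to check which paffinity components were built into a given installation (ompi_info is the standard Open MPI introspection tool; the grep is just a convenience):

    ompi_info | grep paffinity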

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Jeff Squyres
Fixed in r19001. Please re-test; it fixes the problem for me (i.e., no need to manually specify sched_yield=0). BTW, this never came up before because:
- the ODLS used to use paffinity, but before PLPA supported the topology stuff, and therefore always returned the number of processors
- wh…
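
In practical terms (a sketch reusing the NetPIPE invocation quoted later in this thread; NPmpi is the NetPIPE binary), the before/after difference is whether the manual override is still required:

    # pre-r19001 workaround: force yielding off by hand
    mpirun -np 2 --mca btl sm,self --mca mpi_yield_when_idle 0 NPmpi

    # with the fix, the same run should no longer need the override
    mpirun -np 2 --mca btl sm,self NPmpi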

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Terry Dontje
Jeff Squyres wrote: On Jul 23, 2008, at 10:37 AM, Terry Dontje wrote: This seems to work for me too. What is interesting is that my experiments have shown that if you run on RH5.1 you don't need to set mpi_yield_when_idle to 0. Yes, this makes sense -- on RHEL5.1, it's a much newer Linux kernel…

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Lenny Verkhovsky
can this also be a reason for segv on NUMA nodes (#1382) that I can't recreate? On 7/23/08, Jeff Squyres wrote: > On Jul 23, 2008, at 10:37 AM, Terry Dontje wrote: >> This seems to work for me too. What is interesting is that my experiments have shown that if you run on RH5.1 you don't need t…

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Jeff Squyres
On Jul 23, 2008, at 10:37 AM, Terry Dontje wrote: This seems to work for me too. What is interesting is that my experiments have shown that if you run on RH5.1 you don't need to set mpi_yield_when_idle to 0. Yes, this makes sense -- on RHEL5.1, it's a much newer Linux kernel and PLPA works as…

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Ralph Castain
I added a tad more output to the debugging statement so you can see how many processors were found, how many children we have, and what the sched_yield will be set to... Besides, that way I got to be the one that hit r19000! On Jul 23, 2008, at 9:21 AM, Jeff Squyres wrote: It's PLPA that's…

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Jeff Squyres
It's PLPA that's at fault here; I'm running on an older Linux kernel that doesn't have the topology information available. So PLPA is saying "can't give you anything, sorry" (including how many processors are available) -- but that might not be true. I need to think about this a bit to co…

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Ralph Castain
Here is a real simple test that will tell us a bunch about what is going on: run this again with -mca odls_base_verbose 5. You'll get some output, but what we are looking for specifically is a message that includes "launch oversubscribed set to...". This will tell us what ORTE -thinks- the…
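
For example, reusing the NetPIPE invocation quoted later in this thread (any of the slow runs will do):

    mpirun -mca odls_base_verbose 5 -np 2 --mca btl sm,self NPmpi

and look for the "launch oversubscribed set to..." line in the resulting output.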

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Terry Dontje
This seems to work for me too. What is interesting is that my experiments have shown that if you run on RH5.1 you don't need to set mpi_yield_when_idle to 0. --td Jeff Squyres wrote: Doh! I guess we still don't have that calculation right yet; I thought we had fixed that... [7:12] svbu-mpi052:…

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Jeff Squyres
Doh! I guess we still don't have that calculation right yet; I thought we had fixed that... [7:12] svbu-mpi052:~/svn/ompi-tests/NetPIPE-3.7.1 % mpirun --mca mpi_paffinity_alone 1 -np 2 --mca btl sm,self --mca mpi_yield_when_idle 0 NPmpi 0: svbu-mpi052 1: svbu-mpi052 Now starting the main…

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread George Bosilca
Can you try the HEAD with mpi_yield_when_idle set to 0, please? Thanks, george. On Jul 23, 2008, at 3:39 PM, Jeff Squyres wrote: Short version: I'm seeing a large performance drop between r18850 and the SVN HEAD. Longer version: FWIW, I ran the tests on 3 versions on a woodcrest-…

Re: [OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Jeff Squyres
Short version: I'm seeing a large performance drop between r18850 and the SVN HEAD. Longer version: FWIW, I ran the tests on 3 versions on a woodcrest-class x86_64 machine running RHEL4U4:
* Trunk HEAD (r18997)
* r18973 --> had to patch the cpu64* thingy in the openib btl to get it to compil…

[OMPI devel] Fwd: [OMPI bugs] [Open MPI] #1250: Performance problem on SM

2008-07-23 Thread Lenny Verkhovsky
Sorry Terry, :). -- Forwarded message -- From: Lenny Verkhovsky Date: Jul 23, 2008 2:22 PM Subject: Re: [OMPI devel] [OMPI bugs] [Open MPI] #1250: Performance problem on SM To: Lenny Berkhovsky On 7/23/08, Terry Dontje wrote: > I didn't s…