Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-13 Thread Chris Jewell
Hi Dave, Reuti,

Sorry for kicking off this thread, and then disappearing.  I've been away for a 
bit.  Anyway, Dave, I'm glad you experienced the same issue as I had with my 
installation of SGE 6.2u5 and OpenMPI with core binding -- namely that with 
'qsub -pe openmpi 8 -binding set linear:1 ', if two or more of 
the parallel processes get scheduled to the same execution node, then the 
processes end up being bound to the same core.  Not good!

I've been playing around quite a bit trying to understand this issue, and ended 
up on the GE dev list:

http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=39=285878

It seems that most people expect that calls to 'qrsh -inherit' (which I assume 
OpenMPI uses to bind parallel processes to reserved GE slots) activate a 
separate binding.  This does not appear to be the case.  I *was* hoping that 
using -binding pe linear:1 might enable me to write a script that reads the 
pe_hostfile and creates a machine file for OpenMPI, but this fails as GE does 
not appear to give information as to which cores are unbound, only the number 
required.
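
For reference, the host/slot half of such a script is simple enough.  Here is a 
minimal sketch in Python, assuming the usual pe_hostfile layout of one line per 
host (hostname, slot count, queue, processor range) and OpenMPI's 
"host slots=N" hostfile syntax.  The function name is made up for illustration, 
and the sketch ignores the processor-range column entirely, so it does nothing 
about the core-binding problem itself:

```python
import os
import sys


def pe_hostfile_to_machinefile(pe_hostfile_text):
    """Turn pe_hostfile text into OpenMPI hostfile text (hypothetical helper).

    Assumes each non-empty line starts with: <hostname> <slots> ...
    Any further columns (queue, processor range) are ignored.
    """
    out = []
    for raw in pe_hostfile_text.splitlines():
        fields = raw.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, slots = fields[0], int(fields[1])
        out.append(f"{host} slots={slots}\n")
    return "".join(out)


if __name__ == "__main__":
    # GE exports the path of the pe_hostfile as $PE_HOSTFILE inside a PE job.
    path = os.environ.get("PE_HOSTFILE")
    if path:
        with open(path) as fh:
            sys.stdout.write(pe_hostfile_to_machinefile(fh.read()))
```

Run inside the job script, the output could be redirected to a file and passed 
to mpirun with --hostfile.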

So, for now, my solution has been to use a JSV to remove core binding for the 
MPI jobs (but retain it for serial and SMP jobs).  Any more ideas??

Cheers,

Chris

(PS. Dave: how is my alma mater these days??)
--
Dr Chris Jewell
Department of Statistics
University of Warwick
Coventry
CV4 7AL
UK
Tel: +44 (0)24 7615 0778


[OMPI users] Solaris10/SPARC: atomic_cmpset_64 broken

2010-11-13 Thread Nicolai Stange
Hi everybody,

gcc 4.5.1 with -O2 optimizes the 'ret = newval' assignment away because %0 is
declared as write-only ("=m").
The attached fix changes the constraint to read-write ("+m").

Regards

Nicolai
--- a/openmpi-1.4.3/opal/include/opal/sys/sparcv9/atomic.h	2009-12-08 21:36:02.0 +0100
+++ openmpi-1.4.3/opal/include/opal/sys/sparcv9/atomic.h	2010-11-12 21:20:28.356657500 +0100
@@ -159,7 +159,7 @@
    "ldx %2, %%g2   \n\t" /* g2 = oldval */
    "casxa [%1] " ASI_P ", %%g2, %%g1 \n\t"
    "stx %%g1, %0   \n"
-   : "=m"(ret)
+   : "+m"(ret)
    : "r"(addr), "m"(oldval)
    : "%g1", "%g2"
    );


[OMPI users] source code for presentation/papers

2010-11-13 Thread Vasiliy G Tolstov
Hello. I read a very good paper about XenMPI and inter-domain communication
(http://www.open-mpi.org/papers/trinity-btl-2009/xenmpi_report.pdf).

The document contains some instructions on how to build xensocket and the Xen
BTL with Open MPI. Where can I find the source?

-- 
Vasiliy G Tolstov 
Selfip.Ru