[OMPI users] oversubscription of slots with GridEngine

2014-11-09 Thread SLIM H.A.
We switched on hyper threading on our cluster with two eight core sockets per node (32 threads per node). We configured gridengine with 16 slots per node to allow the 16 extra threads for kernel process use but this apparently does not work. Printout of the gridengine hostfile shows that for a

Re: [OMPI users] oversubscription of slots with GridEngine

2014-11-11 Thread SLIM H.A.
at 18:20, SLIM H.A. <h.a.s...@durham.ac.uk> wrote: We switched on hyper threading on our cluster with two eight core sockets per node (32 threads per node). We configured gridengine with 16 slots per node to allow the 16 extra threads for kernel process use but this apparently d

[OMPI users] orte-odls-default:execv-error

2011-04-05 Thread SLIM H.A.
After an upgrade of our system I receive the following error message (openmpi 1.4.2 with gridengine): >quote -- Sorry! You were supposed to get help about: orte-odls-default:execv-error But I couldn't open the help file:

Re: [OMPI users] orte-odls-default:execv-error

2011-04-05 Thread SLIM H.A.
On Behalf Of Terry Dontje Sent: 05 April 2011 11:21 To: us...@open-mpi.org Subject: Re: [OMPI users] orte-odls-default:execv-error On 04/05/2011 05:11 AM, SLIM H.A. wrote: After an upgrade of our system I receive the following error message (openmpi 1.4.2 with gridengine):

Re: [OMPI users] orte-odls-default:execv-error

2011-04-05 Thread SLIM H.A.
ti > Sent: 05 April 2011 11:23 > To: Open MPI Users > Subject: Re: [OMPI users] orte-odls-default:execv-error > > On 05.04.2011 at 11:11, SLIM H.A. wrote: > > > After an upgrade of our system I receive the following error message > > (openmpi 1.4.2 with gridengine):

[OMPI users] openmpi fails on mx endpoint busy

2007-07-05 Thread SLIM H.A.
Hello I have compiled openmpi-1.2.3 with the --with-mx= configuration and gcc compiler. On testing with 4-8 slots I get an error message, the mx ports are busy: >mpirun --mca btl mx,self -np 4 ./cpi [node001:10071] mca_btl_mx_init: mx_open_endpoint() failed with status=20 [node001:10074] mca_btl

Re: [OMPI users] openmpi fails on mx endpoint busy

2007-07-06 Thread SLIM H.A.
her use MX's shared > memory support, instead use '--mca btl mx,self --mca > btl_mx_shared_mem 1'. However, in most cases I believe Open > MPI's shared memory support is a bit better. > > Alternatively, if you don't specify any btls, Open MPI should

Re: [OMPI users] openmpi fails on mx endpoint busy

2007-07-06 Thread SLIM H.A.
ers] openmpi fails on mx endpoint busy If the machine is multi-processor you might want to add the sm btl. That cleared up some similar problems for me, though I don't use mx so your mileage may vary. On 7/5/07, SLIM H.A. wrote:

Re: [OMPI users] openmpi fails on mx endpoint busy

2007-07-09 Thread SLIM H.A.
n? Thanks Henk > -Original Message- > From: users-boun...@open-mpi.org > [mailto:users-boun...@open-mpi.org] On Behalf Of Tim Prins > Sent: 06 July 2007 15:59 > To: Open MPI Users > Subject: Re: [OMPI users] openmpi fails on mx endpoint busy > > Henk, > > O

Re: [OMPI users] openmpi fails on mx endpoint busy

2007-07-10 Thread SLIM H.A.
ins > Sent: 09 July 2007 16:34 > To: Open MPI Users > Subject: Re: [OMPI users] openmpi fails on mx endpoint busy > > SLIM H.A. wrote: > > > > Dear Tim and Scott > > > > I followed the suggestions made: > > > >> So you should either pass '

[OMPI users] sge qdel fails

2007-07-23 Thread SLIM H.A.
I am using OpenMPI 1.2.3 with SGE 6.0u7 over InfiniBand (OFED 1.2), following the recommendation in the OpenMPI FAQ http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge The job runs but when the user wants to delete the job with the qdel command, this fails. Does the mpirun command mpi

Re: [OMPI users] sge qdel fails

2007-07-23 Thread SLIM H.A.
omething like > /tmp/174.1.all.q/) which contains the OMPI session dir for > the running job, and in turns would cause orted and the user > processes to exit. > > Maybe you could try qdel -f to force delete from the > sge_qmaster, in case when sge_execd does not respond to the

[OMPI users] infiniband

2008-04-27 Thread SLIM H.A.
Is it possible to get information about the usage of hca ports similar to the result of the mx_endpoint_info command for Myrinet boards? The ibstat command gives information like this: Port 1: State: Active Physical state: LinkUp but does not say whether a job is actually using an infiniband po

[OMPI users] using OpenMPI + SGE in a heterogeneous network

2008-06-06 Thread SLIM H.A.
Hi I want to use SGE to run jobs on a cluster with mx and infiniband nodes. By dividing the nodes into two host groups SGE will submit to either interconnect. The interconnect can be specified in the mpirun command with the --mca btl parameter. However users would then have to decide at runtime

Re: [OMPI users] using OpenMPI + SGE in a heterogeneous network

2008-06-06 Thread SLIM H.A.
:31, Patrick Geoffray wrote: > SLIM H.A. wrote: >> I would be grateful for any advice > > Just to check, you are not using the MTL for MX, right? Only the BTL > interface allows choosing between several devices at run time. At least there would be the option to build two binar

Re: [OMPI users] using OpenMPI + SGE in a heterogeneous network

2008-06-07 Thread SLIM H.A.
un...@open-mpi.org on behalf of John Hearns Sent: Sat 6/7/2008 12:07 AM To: Open MPI Users Subject: Re: [OMPI users] using OpenMPI + SGE in a heterogeneous network On Fri, 2008-06-06 at 17:56 +0100, SLIM H.A. wrote: > Hi > > I want to use SGE to run jobs on a cluster with mx and infiniband n

Re: [OMPI users] using OpenMPI + SGE in a heterogeneous network

2008-06-07 Thread SLIM H.A.
8 6:47 PM > To: Open MPI Users > Subject: Re: [OMPI users] using OpenMPI + SGE in a heterogeneous > network > > On 06.06.2008 at 19:31, Patrick Geoffray wrote: > > > SLIM H.A. wrote: > >> I would be grateful for any advice > > > > Just to check, you are n

[OMPI users] btl parameter is not set to openib on node with ib card

2008-06-17 Thread SLIM H.A.
Hi OpenMPI does not pick up the infiniband component on our nodes with Mellanox cards: ompi_info --param btl openib returns MCA btl: parameter "btl_base_debug" (current value: "0") If btl_base_debug is 1 standard debug is output, if > 1 verbose debug is output MCA btl: parameter "btl"

Re: [OMPI users] btl parameter is not set to openib on node with ib card

2008-06-17 Thread SLIM H.A.
open-mpi.org/faq/?category=tuning#setting-mca-params > > > On Jun 17, 2008, at 10:49 AM, SLIM H.A. wrote: > > > > > Hi > > > > OpenMPI does not pick up the infiniband component on our nodes with > > Mellanox cards: > > > > ompi_info --pa

[OMPI users] Error in mx_init message

2008-06-18 Thread SLIM H.A.
I have OpenMPI-1.2.5 configured with myrinet and infiniband: OMPI_MCA_btl=openib,self,sm The job runs with the "error" message "Error in mx_init (error MX driver not loaded.)" which makes sense in itself as there is no myrinet card on the node. Is it correct to assume that the ib interconnec

Re: [OMPI users] Error in mx_init message

2008-06-18 Thread SLIM H.A.
> environment, then this warning will go away. > > george. > > On Jun 18, 2008, at 4:22 PM, SLIM H.A. wrote: > > > > > I have OpenMPI-1.2.5 configured with myrinet and infiniband: > > > > OMPI_MCA_btl=openib,self,sm > > > > The job runs

Re: [OMPI users] Error in mx_init message

2008-06-18 Thread SLIM H.A.
; the card will be used. If you only specify the btls then Open > MPI will try to initialize the CM PML and that's how this > error message appears. If you add OMPI_MCA_pml=^cm to your > environment, then this warning will go away. > >george. > > On Jun 18, 2008, at

Re: [OMPI users] Error in mx_init message

2008-06-19 Thread SLIM H.A.
m and the others. Basically, the difference is how > the card will be used. If you only specify the btls then Open > MPI will try to initialize the CM PML and that's how this > error message appears. If you add OMPI_MCA_pml=^cm to your > environment, then this warning will go a

[OMPI users] where is opal_install_dirs?

2008-10-10 Thread SLIM H.A.
I tried building Global Arrays with OpenMPI 1.2.3 and the portland compilers 7.0.2. It gives an error message about an undefined symbol "opal_install_dirs": mpif90 -O -i8 -c -o dgetf2.o dgetf2.f mpif90: symbol lookup error: mpif90: undefined symbol: opal_install_dirs make[1]: *** [dgetf2.o] Erro

Re: [OMPI users] ompi mca mxm version

2012-07-05 Thread SLIM H.A.
Hi Do you have any details about the performance of mxm, e.g. for real applications? Thanks Henk From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Mike Dubman Sent: 11 May 2012 19:23 To: Open MPI Users Subject: Re: [OMPI users] ompi mca mxm version ob1/openib

[OMPI users] orted: error while loading shared libraries

2010-04-08 Thread SLIM H.A.
Dear OpenMPI users We built OpenMPI 1.4.1 on a new cluster and get the following error message when starting a test job from the master node: ham4#mpirun -np 4 --host cn001 /path/hello orted: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or dire

[OMPI users] ga-4.1 on mx segmentation violation

2008-10-21 Thread SLIM H.A.
I have built the release candidate for ga-4.1 with OpenMPI 1.2.3 and portland compilers 7.0.2 for Myrinet mx. Running the test.x for 3 Myrinet nodes each with 4 cores I get the following error messages: warning:regcache incompatible with malloc libibverbs: Fatal: couldn't read uverbs ABI version

Re: [OMPI users] ga-4.1 on mx segmentation violation

2008-10-23 Thread SLIM H.A.
ctober 2008 22:32 > To: Open MPI Users > Subject: Re: [OMPI users] ga-4.1 on mx segmentation violation > > SLIM H.A. wrote: > > I have built the release candidate for ga-4.1 with OpenMPI > 1.2.3 and > > portland compilers 7.0.2 for Myrinet mx. > > Which versi

[OMPI users] Error in mx_init (error MX library incompatible with driver version)

2009-06-19 Thread SLIM H.A.
This is a question I raised before but for OpenMPI over IB. I have built the app with the Portland compiler and OpenMPI 1.2.3 for Myrinet and InfiniBand. Now I wish to run this on some nodes that have no fast interconnect. We use GridEngine, this is the script: #!/bin/csh #$ -cwd ##$ -j y module

Re: [OMPI users] Error in mx_init (error MX library incompatible with driver version)

2009-06-21 Thread SLIM H.A.
Scott Atchley Sent: 19 June 2009 23:23 To: Open MPI Users Subject: Re: [OMPI users] Error in mx_init (error MX library incompatible with driver version) On Jun 19, 2009, at 1:05 PM, SLIM H.A. wrote: > Although the mismatch between MX lib version and the kernel version > appears to cause the m