Re: [OMPI users] oversubscription of slots with GridEngine

2014-11-11 Thread SLIM H.A.
Hi, On 09.11.2014 at 18:20, SLIM H.A. <h.a.s...@durham.ac.uk> wrote: We switched on hyper threading on our cluster with two eight core sockets per node (32 threads per node). We configured gridengine with 16 slots per node to allo

[OMPI users] oversubscription of slots with GridEngine

2014-11-09 Thread SLIM H.A.
We switched on hyper threading on our cluster with two eight core sockets per node (32 threads per node). We configured gridengine with 16 slots per node to allow the 16 extra threads for kernel process use but this apparently does not work. Printout of the gridengine hostfile shows that for

Re: [OMPI users] ompi mca mxm version

2012-07-05 Thread SLIM H.A.
Hi Do you have any details about the performance of mxm, e.g. for real applications? Thanks Henk From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Mike Dubman Sent: 11 May 2012 19:23 To: Open MPI Users Subject: Re: [OMPI users] ompi mca mxm version ob1/openib

[OMPI users] orted: error while loading shared libraries

2010-04-08 Thread SLIM H.A.
Dear OpenMPI users We built OpenMPI 1.4.1 on a new cluster and get the following error message when starting a test job from the master node: ham4#mpirun -np 4 --host cn001 /path/hello orted: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or

Re: [OMPI users] Error in mx_init (error MX library incompatible with driver version)

2009-06-21 Thread SLIM H.A.
Of Scott Atchley Sent: 19 June 2009 23:23 To: Open MPI Users Subject: Re: [OMPI users] Error in mx_init (error MX library incompatible with driver version) On Jun 19, 2009, at 1:05 PM, SLIM H.A. wrote: > Although the mismatch between MX lib version and the kernel version > appears to cause the m

[OMPI users] Error in mx_init (error MX library incompatible with driver version)

2009-06-19 Thread SLIM H.A.
This is a question I raised before but for OpenMPI over IB. I have built the app with the Portland compiler and OpenMPI 1.2.3 for Myrinet and InfiniBand. Now I wish to run this on some nodes that have no fast interconnect. We use GridEngine, this is the script: #!/bin/csh #$ -cwd ##$ -j y

Re: [OMPI users] ga-4.1 on mx segmentation violation

2008-10-23 Thread SLIM H.A.
r 2008 22:32 > To: Open MPI Users > Subject: Re: [OMPI users] ga-4.1 on mx segmentation violation > > SLIM H.A. wrote: > > I have built the release candidate for ga-4.1 with OpenMPI > 1.2.3 and > > portland compilers 7.0.2 for Myrinet mx. > > Which version of

[OMPI users] ga-4.1 on mx segmentation violation

2008-10-21 Thread SLIM H.A.
I have built the release candidate for ga-4.1 with OpenMPI 1.2.3 and portland compilers 7.0.2 for Myrinet mx. Running the test.x for 3 Myrinet nodes each with 4 cores I get the following error messages: warning:regcache incompatible with malloc libibverbs: Fatal: couldn't read uverbs ABI

[OMPI users] where is opal_install_dirs?

2008-10-10 Thread SLIM H.A.
I tried building Global Arrays with OpenMPI 1.2.3 and the portland compilers 7.0.2. It gives an error message about an undefined symbol "opal_install_dirs": mpif90 -O -i8 -c -o dgetf2.o dgetf2.f mpif90: symbol lookup error: mpif90: undefined symbol: opal_install_dirs make[1]: *** [dgetf2.o]

Re: [OMPI users] Error in mx_init message

2008-06-19 Thread SLIM H.A.
es: cm and the others. Basically, the difference is how > the card will be used. If you only specify the btls then Open > MPI will try to initialize the CM PML and that's how this > error message appears. If you add OMPI_MCA_pml=^cm to your > environment, then this warning will go away.

Re: [OMPI users] Error in mx_init message

2008-06-18 Thread SLIM H.A.
> the card will be used. If you only specify the btls then Open > MPI will try to initialize the CM PML and that's how this > error message appears. If you add OMPI_MCA_pml=^cm to your > environment, then this warning will go away. > >george. > > On Jun 18, 2008, at 4:22

Re: [OMPI users] Error in mx_init message

2008-06-18 Thread SLIM H.A.
environment, then this warning will go away. > > george. > > On Jun 18, 2008, at 4:22 PM, SLIM H.A. wrote: > > > > > I have OpenMPI-1.2.5 configured with myrinet and infiniband: > > > > OMPI_MCA_btl=openib,self,sm > > > > The job runs with

[OMPI users] Error in mx_init message

2008-06-18 Thread SLIM H.A.
I have OpenMPI-1.2.5 configured with myrinet and infiniband: OMPI_MCA_btl=openib,self,sm The job runs with the "error" message "Error in mx_init (error MX driver not loaded.)" which makes sense in itself as there is no myrinet card on the node. Is it correct to assume that the ib

Re: [OMPI users] btl parameter is not set to openib on node with ib card

2008-06-17 Thread SLIM H.A.
open-mpi.org/faq/?category=tuning#setting-mca-params > > > On Jun 17, 2008, at 10:49 AM, SLIM H.A. wrote: > > > > > Hi > > > > OpenMPI does not pick up the infiniband component on our nodes with > > Mellanox cards: > > > > o

[OMPI users] btl parameter is not set to openib on node with ib card

2008-06-17 Thread SLIM H.A.
Hi OpenMPI does not pick up the infiniband component on our nodes with Mellanox cards: ompi_info --param btl openib returns MCA btl: parameter "btl_base_debug" (current value: "0") If btl_base_debug is 1 standard debug is output, if > 1 verbose debug is output MCA btl: parameter "btl"

Re: [OMPI users] using OpenMPI + SGE in a heterogeneous network

2008-06-07 Thread SLIM H.A.
7 PM > To: Open MPI Users > Subject: Re: [OMPI users] using OpenMPI + SGE in a heterogeneous > network > > On 06.06.2008 at 19:31, Patrick Geoffray wrote: > > > SLIM H.A. wrote: > >> I would be grateful for any advice > > > > Just to check, you are not using t

[OMPI users] using OpenMPI + SGE in a heterogeneous network

2008-06-06 Thread SLIM H.A.
Hi I want to use SGE to run jobs on a cluster with mx and infiniband nodes. By dividing the nodes into two host groups SGE will submit to either interconnect. The interconnect can be specified in the mpirun command with the --mca btl parameter. However users would then have to decide at runtime

[OMPI users] infiniband

2008-04-27 Thread SLIM H.A.
Is it possible to get information about the usage of hca ports similar to the result of the mx_endpoint_info command for Myrinet boards? The ibstat command gives information like this: Port 1: State: Active Physical state: LinkUp but does not say whether a job is actually using an infiniband

Re: [OMPI users] sge qdel fails

2007-07-23 Thread SLIM H.A.
ing like > /tmp/174.1.all.q/) which contains the OMPI session dir for > the running job, and in turns would cause orted and the user > processes to exit. > > Maybe you could try qdel -f to force delete from the > sge_qmaster, in case when sge_execd does not respond to the >

[OMPI users] sge qdel fails

2007-07-23 Thread SLIM H.A.
I am using OpenMPI 1.2.3 with SGE 6.0u7 over InfiniBand (OFED 1.2), following the recommendation in the OpenMPI FAQ http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge The job runs but when the user wants to delete the job with the qdel command, this fails. Does the mpirun command

Re: [OMPI users] openmpi fails on mx endpoint busy

2007-07-09 Thread SLIM H.A.
Henk > -Original Message- > From: users-boun...@open-mpi.org > [mailto:users-boun...@open-mpi.org] On Behalf Of Tim Prins > Sent: 06 July 2007 15:59 > To: Open MPI Users > Subject: Re: [OMPI users] openmpi fails on mx endpoint busy > > Henk, > > On Friday 0

[OMPI users] openmpi fails on mx endpoint busy

2007-07-05 Thread SLIM H.A.
Hello I have compiled openmpi-1.2.3 with the --with-mx= configuration and gcc compiler. On testing with 4-8 slots I get an error message, the mx ports are busy: >mpirun --mca btl mx,self -np 4 ./cpi [node001:10071] mca_btl_mx_init: mx_open_endpoint() failed with status=20 [node001:10074]