Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Joe Landman
On 03/27/2014 01:53 PM, Sasso, John (GE Power & Water, Non-GE) wrote: When a piece of software built against OpenMPI fails, I will see an error referring to the rank of the MPI task which incurred the failure. For example: MPI_ABORT was invoked on rank 1236 in communicator MPI_COMM_WORLD with
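A minimal sketch (not from the thread) of one way to recover the rank-to-host mapping yourself: have every rank report its hostname at startup with MPI_Get_processor_name. Names and output format here are illustrative only.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, namelen;
        char host[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(host, &namelen);

        /* every rank reports which host it landed on */
        printf("rank %d of %d running on %s\n", rank, size, host);

        MPI_Finalize();
        return 0;
    }

Open MPI's mpirun can also print the planned rank/host layout at launch with its --display-map option, which avoids touching the application at all.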

Re: [OMPI users] IO performance

2012-02-04 Thread Joe Landman
On 02/03/2012 01:46 PM, Tom Rosmond wrote: Recently the organization I work for bought a modest sized Linux cluster for running large atmospheric data assimilation systems. In my experience a glaring problem with systems of this kind is poor IO performance. Typically they have 2 types of

Re: [OMPI users] How closely tied is a specific release of OpenMPI to the host operating system and other system software?

2011-02-02 Thread Joe Landman
On 2/1/2011 5:02 PM, Jeffrey A Cummings wrote: I use OpenMPI on a variety of platforms: stand-alone servers running Solaris on sparc boxes and Linux (mostly CentOS) on AMD/Intel boxes, also Linux (again CentOS) on large clusters of AMD/Intel boxes. These platforms all have some version of the

Re: [OMPI users] very bad parallel scaling of vasp using openmpi

2009-09-23 Thread Joe Landman
Rahul Nabar wrote: On Tue, Aug 18, 2009 at 5:28 PM, Gerry Creager wrote: Most of that bandwidth is in marketing... Sorry, but it's not a high performance switch. Well, how does one figure out what exactly is a "high performance switch"? I've found this an exceedingly

Re: [OMPI users] very bad parallel scaling of vasp using openmpi

2009-08-18 Thread Joe Landman
Craig Plaisance wrote: The switch we are using (Dell Powerconnect 6248) has a switching fabric capacity of 184 Gb/s, which should be more than adequate for the 48 ports. Is this the same as backplane bandwidth? Yes. If you are getting the behavior you describe, you are not getting all that

Re: [OMPI users] very bad parallel scaling of vasp using openmpi

2009-08-18 Thread Joe Landman
Craig Plaisance wrote: mpich2 now and post the results. So, does anyone know what causes the wild oscillations in the throughput at larger message sizes and higher network traffic? Thanks! Your switch can't handle this amount of traffic on its backplane. We have seen this often in

Re: [OMPI users] very bad parallel scaling of vasp using openmpi

2009-08-18 Thread Joe Landman
Craig Plaisance wrote: Hi - I have compiled vasp 4.6.34 using the Intel fortran compiler 11.1 with openmpi 1.3.3 on a cluster of 104 nodes running Rocks 5.2 with two quad core opterons connected by a Gbit ethernet. Running in parallel on Latency of gigabit is likely your issue. Lower

Re: [OMPI users] Problem getting OpenMPI to run

2009-06-01 Thread Joe Landman
Jeff Layton wrote: Jeff Squyres wrote: On Jun 1, 2009, at 2:04 PM, Jeff Layton wrote: error: executing task of job 3084 failed: execution daemon on host "compute-2-2.local" didn't accept task This looks like an error message from the resource manager/scheduler -- not from OMPI (i.e., OMPI

Re: [OMPI users] Any scientific application heavily using MPI_Barrier?

2009-03-05 Thread Joe Landman
Jeff Squyres wrote: On Mar 5, 2009, at 10:33 AM, Gerry Creager wrote: We've been playing with it in a coupled atmosphere-ocean model to allow the two to synchronize and exchange data. The models have differing levels of physics complexity and the time step requirements are significantly
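As a hedged illustration of that coupling pattern (a toy, not the models discussed): split the ranks into two "components", let each finish its own work, then use MPI_Barrier to mark the synchronization point before the exchange. All names are illustrative; run with an even number of ranks.

    #include <mpi.h>
    #include <stdio.h>

    /* Toy coupling step: the lower half of the ranks plays "atmosphere",
       the upper half "ocean". */
    int main(int argc, char **argv)
    {
        int rank, size;
        double my_field, peer_field;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        my_field = (double) rank;            /* stand-in for model state */

        /* both components reach the coupling step before exchanging */
        MPI_Barrier(MPI_COMM_WORLD);

        int peer = (rank < size / 2) ? rank + size / 2 : rank - size / 2;
        MPI_Sendrecv(&my_field, 1, MPI_DOUBLE, peer, 0,
                     &peer_field, 1, MPI_DOUBLE, peer, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        printf("rank %d exchanged data with rank %d (got %g)\n",
               rank, peer, peer_field);
        MPI_Finalize();
        return 0;
    }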

Re: [OMPI users] openmpi over tcp

2009-01-29 Thread Joe Landman
Daniel De Marco wrote: Hi Ralph, * Ralph Castain [01/29/2009 14:27]: It is quite likely that you have IPoIB on your system. In that case, the TCP BTL will pickup that interface and use it. If you have a specific interface you want to use, try -mca btl_tcp_if_include eth0 (or
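For reference, the suggested restriction looks like this on the command line (the rank count and executable name are placeholders):

    mpirun -np 16 --mca btl_tcp_if_include eth0 ./my_app

The complementary parameter btl_tcp_if_exclude can instead list interfaces (for example the IPoIB one) that the TCP BTL should skip.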

Re: [OMPI users] mpiblast + openmpi + gridengine job fails to run

2008-12-24 Thread Joe Landman
Reuti wrote: Hi, On 24.12.2008 at 07:55, Sangamesh B wrote: Thanks Reuti. That sorted out the problem. Now mpiblast is able to run, but only on a single node, i.e. mpiformatdb -> 4 fragments, mpiblast -> 4 processes. Since each node has 4 cores, the job will run on a single node and

Re: [OMPI users] problems with MPI_Waitsome/MPI_Allstart and OpenMPI on gigabit and IB networks

2008-07-20 Thread Joe Landman
update 2: (it's like I am talking to myself ... :) must start using decaf ...) Joe Landman wrote: Joe Landman wrote: [...] ok, fixed this. Turns out we have ipoib going, and one adapter needed to be brought down and back up. Now the tcp version appears to be running, though I do get

Re: [OMPI users] problems with MPI_Waitsome/MPI_Allstart and OpenMPI on gigabit and IB networks

2008-07-20 Thread Joe Landman
Joe Landman wrote: 3) using btl to turn off sm and openib, generates lots of these messages: [c1-8][0,1,4][btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113 [...] No route to host at -e line 1. This is wrong, all the nodes are visible from all
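The BTL selection being described can be made explicit on the mpirun command line; a typical invocation forcing TCP only (executable name is a placeholder) is:

    mpirun -np 16 --mca btl tcp,self ./my_app

or, equivalently for this purpose, excluding the shared-memory and InfiniBand components with --mca btl ^sm,openib.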

[OMPI users] problems with MPI_Waitsome/MPI_Allstart and OpenMPI on gigabit and IB networks

2008-07-20 Thread Joe Landman
Hi folks: This is a deeper dive into the code that was giving me fits over the last two weeks. It uses MPI_Waitsome and MPI_Allstart to launch/monitor progress. More on that in a moment. The testing I have done to date on this platform suggests that OpenMPI is working fine, though I
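(The MPI_Allstart in the subject is presumably MPI_Startall, the persistent-request start call.) A minimal, generic sketch of the Startall/Waitsome pattern, not the actual code from this post, looks like:

    #include <mpi.h>
    #include <stdio.h>

    /* Persistent-request pattern: build the requests once, start them
       all, then reap whichever complete first with MPI_Waitsome.
       The ring exchange here is illustrative only. */
    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int right = (rank + 1) % size, left = (rank + size - 1) % size;
        double sendbuf = (double) rank, recvbuf = -1.0;
        MPI_Request reqs[2];

        MPI_Send_init(&sendbuf, 1, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Recv_init(&recvbuf, 1, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[1]);

        MPI_Startall(2, reqs);

        int done = 0, outcount, indices[2];
        while (done < 2) {
            MPI_Waitsome(2, reqs, &outcount, indices, MPI_STATUSES_IGNORE);
            done += outcount;
        }

        printf("rank %d received %g from rank %d\n", rank, recvbuf, left);

        MPI_Request_free(&reqs[0]);
        MPI_Request_free(&reqs[1]);
        MPI_Finalize();
        return 0;
    }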

Re: [OMPI users] GCC extendability to OpenMPI Specification

2008-06-04 Thread Joe Landman
Mukesh K Srivastava wrote: Hi OMPI Community. Is there any plan to extend GCC support to OpenMPI, or to implement the OpenMPI specification in GCC for C, C++ & Fortran, making it generally available for platforms which support POSIX? Hi Mukesh: Open MPI is already written
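In practice GCC already compiles and links MPI codes through Open MPI's wrapper compilers; the file names below are hypothetical.

    mpicc  -o hello  hello.c      # C front end, wraps gcc
    mpif90 -o model  model.f90    # Fortran front end, wraps the configured Fortran compiler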

[OMPI users] crash with mpiBLAST

2008-05-07 Thread Joe Landman
Hi Open-MPI team: I am working on a build of mpiBLAST 1.5.0-pio, and found that the code crashes immediately after launch with a seg fault. I used Open-MPI 1.2.6 built from the tarball (with just a --prefix directive). I did just try the code with MPICH 1.2.7p1, and it runs fine. What
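For reference, the build described (tarball plus only a --prefix) would have been along these lines; the install path is just an example:

    ./configure --prefix=/opt/openmpi-1.2.6
    make
    make install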

[OMPI users] quick patch to buildrpm.sh to enable building on SuSE

2006-10-23 Thread Joe Landman
+elif test -d /usr/src/packages; then
+    need_root=1
+    rpmtopdir="/usr/src/packages"
 else
     need_root=1
     rpmtopdir="/usr/src/redhat"
-- Joe Landman landman |at| scalableinformatics |dot| com

Re: [O-MPI users] Re: [Beowulf] Alternative to MPI ABI

2005-04-03 Thread Joe Landman
Mark Hahn wrote: If there is an ABI then we have a fighting chance at focusing on the applications and not the ever-so-slightly-strange version of whichever flavor of MPI that they chose to use. wonderful! yes: ABI standards are good and proprietary implementations (which inherently