Re: [OMPI users] mpi functions are slow when first called and become normal afterwards

2009-11-12 Thread RightCFD
> > Date: Thu, 29 Oct 2009 15:45:06 -0400 > From: Brock Palen > Subject: Re: [OMPI users] mpi functions are slow when first called and > become normal afterwards > To: Open MPI Users > Message-ID: <890cc430-68b0-4307-8260-24a6fadae...@umich.edu> > Content-Type: text/plain; charset=US-ASCII

[OMPI users] OFED-1.5rc1 with OpenMPI and IB

2009-11-12 Thread Stefan Kuhne
Hello, I am trying to set up a small HPC cluster for educational use. InfiniBand is working insofar as I can ping over IB. When I try to run an MPI program I get: user@head:~/Cluster/hello$ mpirun --hostfile ../Cluster.hosts hello --

Re: [OMPI users] mpi functions are slow when first called and become normal afterwards

2009-11-12 Thread Ralph Castain
You can have OMPI wire up -all- available connections at startup of the processes with -mca mpi_preconnect_all 1. Be aware of Brock's caution. Also, note that this occurs at MPI_Init, so you can adjust your timing marks accordingly. On Nov 11, 2009, at 10:04 PM, RightCFD wrote: Date: Thu, 2
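For reference, a minimal sketch of the invocation Ralph describes; the executable name ./app and the rank count are placeholders, not taken from the thread:

mpirun -np 4 -mca mpi_preconnect_all 1 ./app   # wire up all connections during MPI_Init instead of on first use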

Re: [OMPI users] OFED-1.5rc1 with OpenMPI and IB

2009-11-12 Thread Jeff Squyres
Can you submit all the information requested here: http://www.open-mpi.org/community/help/ On Nov 12, 2009, at 1:28 AM, Stefan Kuhne wrote: Hello, I am trying to set up a small HPC cluster for educational use. InfiniBand is working insofar as I can ping over IB. When I try to run an MPI program
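The linked help page asks for build and fabric details; a hedged sketch of commands that gather that sort of information (the page itself lists the exact items to send):

ompi_info | head -n 20    # Open MPI version and build summary
ibv_devinfo               # InfiniBand device state, ports, and firmware
ulimit -l                 # locked-memory limit, a common InfiniBand pitfall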

[OMPI users] Release date for 1.3.4?

2009-11-12 Thread John R. Cary
From http://svn.open-mpi.org/svn/ompi/branches/v1.3/NEWS I see: - Many updates and fixes to the (non-default) "sm" collective component (i.e., native shared memory MPI collective operations). Will this fix the problem noted at https://svn.open-mpi.org/trac/ompi/ticket/2043 ?? Thanks..Joh

Re: [OMPI users] Release date for 1.3.4?

2009-11-12 Thread Ralph Castain
Release should be soon after SC09 is over, I suspect. On Nov 12, 2009, at 7:35 AM, John R. Cary wrote: From http://svn.open-mpi.org/svn/ompi/branches/v1.3/NEWS I see: - Many updates and fixes to the (non-default) "sm" collective component (i.e., native shared memory MPI collective operations).

Re: [OMPI users] mpi functions are slow when first called and become normal afterwards

2009-11-12 Thread Eugene Loh
RightCFD wrote: Date: Thu, 29 Oct 2009 15:45:06 -0400 From: Brock Palen Subject: Re: [OMPI users] mpi functions are slow when first called and become normal afterwards To: Open MPI Users Message-ID: <890cc430-68b0-4307-8260-24a6fadae...@umich

Re: [OMPI users] Release date for 1.3.4?

2009-11-12 Thread Jeff Squyres
I think Eugene will have to answer this one -- Eugene? On Nov 12, 2009, at 6:35 AM, John R. Cary wrote: From http://svn.open-mpi.org/svn/ompi/branches/v1.3/NEWS I see: - Many updates and fixes to the (non-default) "sm" collective component (i.e., native shared memory MPI collective operati

Re: [OMPI users] Release date for 1.3.4?

2009-11-12 Thread Eugene Loh
Jeff Squyres wrote: I think Eugene will have to answer this one -- Eugene? On Nov 12, 2009, at 6:35 AM, John R. Cary wrote: From http://svn.open-mpi.org/svn/ompi/branches/v1.3/NEWS I see: - Many updates and fixes to the (non-default) "sm" collective component (i.e., native shared memory MP

Re: [OMPI users] checkpoint opempi-1.3.3+sge62

2009-11-12 Thread Sergio Díaz
Hi Josh, You were right. The main problem was the /tmp. SGE uses a scratch directory in which each job keeps its temporary files. After setting TMPDIR to /tmp, checkpointing works! However, when I try to restart it, I get the following error (see ERROR1). Running with the -v option adds the lines shown (see ERROR2). I was
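A rough sketch of the checkpoint/restart sequence being discussed, assuming a plain interactive run outside SGE; the executable name, rank count, PID, and snapshot name are illustrative only:

export TMPDIR=/tmp                              # keep the session directory somewhere checkpointable
mpirun -np 4 -am ft-enable-cr ./app &           # enable checkpoint/restart support for the job
ompi-checkpoint <PID of mpirun>                 # take a checkpoint of the running job
ompi-restart <global snapshot reference>        # restart from the saved snapshot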

Re: [OMPI users] users Digest, Vol 1401, Issue 2

2009-11-12 Thread Jeff Squyres
It looks like your executable is explicitly calling MPI_ABORT in the CmiAbort function -- perhaps in response to something happening in the namd or CmiHandleMessage functions. The next logical step would likely be to look in those routines and see why MPI_ABORT/CmiAbort would be invoked.

Re: [OMPI users] mpi functions are slow when first called and become normal afterwards

2009-11-12 Thread Gus Correa
Eugene Loh wrote: RightCFD wrote: Date: Thu, 29 Oct 2009 15:45:06 -0400 From: Brock Palen <bro...@umich.edu> Subject: Re: [OMPI users] mpi functions are slow when first called and become normal afterwards To: Open MPI Users <us...@open-mpi.org> Messa

Re: [OMPI users] users Digest, Vol 1403, Issue 4

2009-11-12 Thread RightCFD
Thanks for all your inputs. It is good to know that this initial latency is expected behavior and that the workaround is to run one dummy iteration before timing starts. I did not notice this because my older parallel CFD code runs a large number of time steps and the initial latency was compensated

Re: [OMPI users] Problem with mpirun -preload-binary option

2009-11-12 Thread Qing Pang
Now that I have passwordless ssh set up in both directions and verified working, I still have the same problem. I'm able to run ssh/scp on both master and client nodes (at this point, they are pretty much the same) without being asked for a password. And mpirun works fine if I have the executabl
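For reference, a minimal form of the option under discussion; the host names and executable are illustrative only:

mpirun --preload-binary -np 2 --host master,node1 ./hello   # copy the local ./hello to the remote node before launch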

Re: [OMPI users] Release date for 1.3.4?

2009-11-12 Thread Douglas Guptill
Hello Eugene: On Thu, Nov 12, 2009 at 07:20:08AM -0800, Eugene Loh wrote: > Jeff Squyres wrote: > >> I think Eugene will have to answer this one -- Eugene? >> >> On Nov 12, 2009, at 6:35 AM, John R. Cary wrote: >> >>> From http://svn.open-mpi.org/svn/ompi/branches/v1.3/NEWS I see: >>> >>> - Many u

[OMPI users] Come see us at SC09!

2009-11-12 Thread Jeff Squyres
Several of us from the Open MPI crew will be at SC09; if you're coming, be sure to stop by and say hello! ...and use the SC09 Fist Bump(tm), of course (http://www.linux-mag.com/id/7608). - I'll be hanging in and around the Cisco booth (#1847), but also giving various other booth talks arou

[OMPI users] oob mca question

2009-11-12 Thread Aaron Knister
Dear List, I'm having a really weird issue with openmpi - version 1.3.3 (version 1.2.8 doesn't seem to exhibit this behavior). Essentially when I start jobs from the cluster front-end node using mpirun, mpirun sits idle for up to a minute and a half (for 30 nodes) before running the comma

Re: [OMPI users] oob mca question

2009-11-12 Thread Ralph Castain
That is indeed the expected behavior, and your solution is the correct one. The orted has no way of knowing which interface mpirun can be reached on, so it has no choice but to work its way through the available ones. Because of the ordering in the way the OS reports the interfaces, it is
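A sketch of the kind of restriction Ralph endorses, assuming the cluster-facing interface is eth0 (the interface name is a placeholder for whatever the nodes actually share):

mpirun -mca oob_tcp_if_include eth0 -np 30 ./app   # let the orteds try only the interface mpirun is reachable on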

Re: [OMPI users] oob mca question

2009-11-12 Thread Aaron Knister
Thanks! I appreciate the response. On Nov 12, 2009, at 9:54 PM, Ralph Castain wrote: That is indeed the expected behavior, and your solution is the correct one. The orted has no way of knowing which interface mpirun can be reached on, so it has no choice but to work its way through the a