Re: [OMPI users] openmpi.ld.conf file

2010-03-31 Thread Jeff Squyres
On Mar 31, 2010, at 5:25 PM, Abhishek Gupta wrote: > I am trying to find out the location of openmpi.ld.conf file for my > openmpi/openmpi-libs. Can someone tell me where that file is placed? There is no openmpi.ld.conf in the official Open MPI distribution. Are you installing Open MPI from a

Re: [OMPI users] ompi-checkpoint --term

2010-03-31 Thread Fernando Lemos
On Wed, Mar 31, 2010 at 7:39 PM, Addepalli, Srirangam V wrote: > Hello All. > I am trying to checkpoint an MPI application that has been started using the > following mpirun command > > mpirun -am ft-enable-cr -np 8 pw.x < Ge46.pw.in > Ge46.ph.out > >

[OMPI users] ompi-checkpoint --term

2010-03-31 Thread Addepalli, Srirangam V
Hello All. I am trying to checkpoint an MPI application that has been started using the following mpirun command: mpirun -am ft-enable-cr -np 8 pw.x < Ge46.pw.in > Ge46.ph.out ompi-checkpoint 31396 (works). However, when I try to terminate the process with ompi-checkpoint --term 31396 it never

Re: [OMPI users] Hide Abort output

2010-03-31 Thread David Singleton
Yes, Dick has isolated the issue - novice users often believe Open MPI (not their application) had a problem. Anything along the lines he suggests can only help. David On 04/01/2010 01:12 AM, Richard Treumann wrote: I do not know what the OpenMPI message looks like or why people want to

[OMPI users] openmpi.ld.conf file

2010-03-31 Thread Abhishek Gupta
Hi, I am trying to find out the location of openmpi.ld.conf file for my openmpi/openmpi-libs. Can someone tell me where that file is placed? Thanks, Abhi.

Re: [OMPI users] openMPI on Xgrid

2010-03-31 Thread Jeff Squyres
Yes, good idea. SGE is a fine scheduler; it's actively supported by Open MPI. On Mar 31, 2010, at 11:21 AM, Cristobal Navarro wrote: > And how about Sun Grid Engine + Open MPI, good idea? > > I'm asking because I just found out that Mathematica 7 supports cluster > integration with SGE which

Re: [OMPI users] openMPI on Xgrid

2010-03-31 Thread Cristobal Navarro
And how about Sun Grid Engine + Open MPI, good idea? I'm asking because I just found out that Mathematica 7 supports cluster integration with SGE, which will be a plus apart from our C programs. Cristobal On Tue, Mar 30, 2010 at 4:06 PM, Gus Correa wrote: > Craig

Re: [OMPI users] Segmentation fault (11)

2010-03-31 Thread Joshua Hursey
That is interesting. I cannot think of any reason why this might be causing a problem just in Open MPI. popen() is similar to fork()/system() so you have to be careful with interconnects that do not play nice with fork(), like openib. But since it looks like you are excluding openib, this

Re: [OMPI users] Hide Abort output

2010-03-31 Thread Richard Treumann
I do not know what the OpenMPI message looks like or why people want to hide it. It should be phrased to avoid any implication of a problem with OpenMPI itself. How about something like this which: "The application has called MPI_Abort. The application is terminated by OpenMPI as the

Re: [OMPI users] kernel 2.6.23 vs 2.6.24 - communication/wait times

2010-03-31 Thread Oliver Geisler
I have tried up to kernel 2.6.33.1 on both architectures (Core2 Duo and i5) with the same results. The "slow" results also appear when the processes are distributed over the 4 cores of one single node. We use btl = self,sm,tcp in /etc/openmpi/openmpi-mca-params.conf Distributing several processes to

Re: [OMPI users] strange problem with OpenMPI + rankfile + Intelcompiler 11.0.074 + centos/fedora-12

2010-03-31 Thread Jeff Squyres
On Mar 24, 2010, at 12:49 AM, Anton Starikov wrote: > Two different OSes: centos 5.4 (2.6.18 kernel) and Fedora-12 (2.6.32 kernel) > Two different CPUs: Opteron 248 and Opteron 8356. > > same binary for OpenMPI. Same binary for user code (vasp compiled for older > arch) Are you sure that the

Re: [OMPI users] kernel 2.6.23 vs 2.6.24 - communication/wait times

2010-03-31 Thread Jeff Squyres
I have a very dim recollection of some kernel TCP issues back in some older kernel versions -- such issues affected all TCP communications, not just MPI. Can you try a newer kernel, perchance? On Mar 30, 2010, at 1:26 PM, wrote: > Hello List, > > I

Re: [OMPI users] OPEN_MPI macro for mpif.h?

2010-03-31 Thread Jeff Squyres
On Mar 29, 2010, at 4:10 PM, Martin Bernreuther wrote: > looking at the Open MPI mpi.h include file there's a preprocessor macro > OPEN_MPI defined, as well as e.g. OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION > and OMPI_RELEASE_VERSION. version.h e.g. also defines OMPI_VERSION > This seems to be

Re: [OMPI users] Problem in remote nodes

2010-03-31 Thread Jeff Squyres
On Mar 30, 2010, at 4:28 PM, Robert Collyer wrote: > I changed the SELinux config to permissive (log only), and it didn't > change anything. Back to the drawing board. I'm afraid I have no experience with SELinux -- I don't know what it restricts. Generally, you need to be able to run

Re: [OMPI users] Help om Openmpi

2010-03-31 Thread Jeff Squyres (jsquyres)
Yes, you need to install Open MPI on all nodes and you need to be able to log in to each node without being prompted for a password. Also, note that v1.2.7 is pretty ancient. If you're just starting with Open MPI, can you upgrade to the latest version? -jms Sent from my PDA. No type good.

Re: [OMPI users] Hide Abort output

2010-03-31 Thread Jeff Squyres (jsquyres)
At present there is no such feature, but it should not be hard to add. Can you guys be a little more specific about exactly what you are seeing and exactly what you want to see? (And what version you're working with - I'll caveat my discussion that this may be a 1.5-and-forward thing) -jms

Re: [OMPI users] Problem in remote nodes

2010-03-31 Thread Jeff Squyres (jsquyres)
Those are normal ssh messages, I think - an ssh session may try multiple auth methods before one succeeds. You're absolutely sure that there's no firewalling software and SELinux is disabled? Open MPI is behaving as if it is trying to communicate and failing (e.g., it's hanging while trying to

Re: [OMPI users] Problem in remote nodes

2010-03-31 Thread uriz . 49949
I've been checking /var/log/messages on the compute node and there is nothing new after executing 'mpirun --host itanium2 -np 2 helloworld.out', but in the /var/log/messages file on the remote node the following messages appear, with nothing about unix_chkpwd. Mar 31 11:56:51 itanium2

Re: [OMPI users] Hide Abort output

2010-03-31 Thread David Singleton
I have to say this is a very common issue for our users. They repeatedly report the long Open MPI MPI_Abort() message in help queries and fail to look for the application error message about the root cause. A short MPI_Abort() message that said "look elsewhere for the real error message" would

[OMPI users] Hide Abort output

2010-03-31 Thread Yves Caniou
Dear all, I am using the MPI_Abort() command in an MPI program. I would like to not see the note explaining that the command caused Open MPI to kill all the jobs and so on. I thought that I could find a --mca parameter, but couldn't grep it. The only ones deal with the delay and printing more

[OMPI users] Help om Openmpi

2010-03-31 Thread Huynh Thuc Cuoc
Dear all, I have installed my cluster with the following configuration: - headnode: + Linux CentOS 5.4, 4 CPUs, 3 GB RAM + Sun Grid Engine sge6.0u12. The headnode is also the admin and submit node. + Open MPI 1.2.9. Open MPI was configured with: ./configure --prefix=/opt/openmpi --with-sge

Re: [OMPI users] Best way to reduce 3D array

2010-03-31 Thread Ricardo Reis
On Tue, 30 Mar 2010, Gus Correa wrote: Salve Ricardo Reis! Como vai a Radio Zero? :) busy, busy, busy. we are preparing to celebrate Yuri's Night, April the 12th! Doesn't this serialize the I/O operation across the processors, whereas MPI_Gather followed by rank_0 I/O may perhaps move