Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
I upgraded from torque 2.4.7 to torque version 2.5.5 and everything works as expected. I am not sure if it is how the old RPMs were compiled or if it is a version problem. In any case, I learned a lot more about Torque and OpenMPI so it is not a total waste of time and effort. Thanks for everyon

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
Yeah, the system admin is me, lol... and this is a new system on which I am frantically trying to work out all the bugs. Torque and MPI are my last hurdles to overcome. But I have already been through some faulty InfiniBand equipment, bad memory and bad drives... which is to be expected on a cluste

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Ralph Castain
mpiexec doesn't use pbsdsh (we use a TM API), but the effect is the same. Been so long since I ran on a Torque machine, though, that I honestly don't remember how to set the LD_LIBRARY_PATH on the backend. Do you have a sys admin there whom you could ask? Or you could ping the Torque list about

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
Hi. The pbsdsh tool is great. I ran an interactive qsub session (qsub -I -lnodes=2:ppn=12) and then ran the pbsdsh tool like this:
[rsvancara@node164 ~]$ /usr/local/bin/pbsdsh -h node164 printenv
PATH=/bin:/usr/bin
LANG=C
PBS_O_HOME=/home/admins/rsvancara
PBS_O_LANG=en_US.UTF-8
PBS_O_LOGNAME=r

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
Ok, these are good things to check. I am going to follow through with this in the next hour after our GPFS upgrade. Thanks!!!
On Mon, Mar 21, 2011 at 11:14 AM, Brock Palen wrote:
> On Mar 21, 2011, at 1:59 PM, Jeff Squyres wrote:
>
>> I no longer run Torque on my cluster, so my Torqueology is p

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Brock Palen
On Mar 21, 2011, at 1:59 PM, Jeff Squyres wrote:
> I no longer run Torque on my cluster, so my Torqueology is pretty rusty --
> but I think there's a Torque command to launch on remote nodes. tmrsh or
> pbsrsh or something like that...?
pbsdsh
If TM is working pbsdsh should work fine. Torque+

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
Ok, let me give this a try. Thanks for all your helpful suggestions.
On Mon, Mar 21, 2011 at 11:10 AM, Ralph Castain wrote:
>
> On Mar 21, 2011, at 11:59 AM, Jeff Squyres wrote:
>
>> I no longer run Torque on my cluster, so my Torqueology is pretty rusty --
>> but I think there's a Torque comma

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
I added that temp directory in, but it does not seem to make a difference either way. It was just to illustrate that I was trying to specify the temp directory in another place. I was under the impression that running mpiexec in a torque/qsub interactive session would be similar to running torque wi

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Ralph Castain
On Mar 21, 2011, at 11:59 AM, Jeff Squyres wrote:
> I no longer run Torque on my cluster, so my Torqueology is pretty rusty --
> but I think there's a Torque command to launch on remote nodes. tmrsh or
> pbsrsh or something like that...?
pbsrsh, IIRC
So run pbsrsh printenv to see the enviro

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Ralph Castain
On Mar 21, 2011, at 11:53 AM, Randall Svancara wrote:
> I am not sure if there is any extra configuration necessary for torque
> to forward the environment. I have included the output of printenv
> for an interactive qsub session. I am really at a loss here because I
> never had this much diffi

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Jeff Squyres
I no longer run Torque on my cluster, so my Torqueology is pretty rusty -- but I think there's a Torque command to launch on remote nodes. tmrsh or pbsrsh or something like that...? Try that and make sure it works. Open MPI should be using the same API as that command under the covers. I als

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Randall Svancara
I am not sure if there is any extra configuration necessary for torque to forward the environment. I have included the output of printenv for an interactive qsub session. I am really at a loss here because I never had this much difficulty making torque run with openmpi. It has been mostly a good

Re: [OMPI users] OpenMPI and Torque

2011-03-21 Thread Ralph Castain
Can you run anything under TM? Try running "hostname" directly from Torque to see if anything works at all. The error message is telling you that the Torque daemon on the remote node reported a failure when trying to launch the OMPI daemon. Could be that Torque isn't set up to forward environmen

Re: [OMPI users] OpenMPI and torque/maui -> crashing on MPI_Send()

2007-10-11 Thread Jeff Squyres
On Oct 10, 2007, at 11:09 AM, Jim Kusznir wrote:
> I've added:
> btl = ^openib
> to /etc/openmpi-mca-params.conf on the head node, but this doesn't seem
> to help. Does this need to be pushed out to all the compute nodes as
> well?
Yes.
--
Jeff Squyres
Cisco Systems
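Since the parameter file has to carry the same setting on every compute node, a small idempotent helper can be handy when pushing the change out. The following is only a sketch: the function name disable_openib is hypothetical, and the path you pass in is whatever parameter file your Open MPI install actually reads on that node (the thread uses /etc/openmpi-mca-params.conf; other installs may keep it elsewhere).

```shell
#!/bin/sh
# Hypothetical helper: make sure an MCA parameter file contains
# "btl = ^openib". Safe to run repeatedly; it appends the line only
# when no equivalent line is present yet.
disable_openib() {
    params_file="$1"

    # Create the directory and the file if they do not exist yet.
    mkdir -p "$(dirname "$params_file")"
    touch "$params_file"

    # Match "btl = ^openib" with any amount of whitespace around "=".
    if ! grep -Eq '^[[:space:]]*btl[[:space:]]*=[[:space:]]*\^openib[[:space:]]*$' "$params_file"; then
        printf 'btl = ^openib\n' >> "$params_file"
    fi
}

# Example call (the path is an assumption taken from the thread):
# disable_openib /etc/openmpi-mca-params.conf
```

Running the same function on each compute node, by whatever mechanism the cluster already uses for distributing files, leaves every copy of the file with exactly one such line.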

Re: [OMPI users] OpenMPI and torque/maui -> crashing on MPI_Send()

2007-10-10 Thread Jim Kusznir
Hi: I've added: btl = ^openib to /etc/openmpi-mca-params.conf on the head node, but this doesn't seem to help. Does this need to be pushed out to all the compute nodes as well? The program is known to work on other clusters. I finally figured out what was happening, though: Openmpi was compiled

Re: [OMPI users] OpenMPI and torque/maui -> crashing on MPI_Send()

2007-10-10 Thread Jeff Squyres
If you do not have IB hardware, you might want to permanently disable the IB support. You can do this by setting an MCA parameter or simply removing the $prefix/lib/openmpi/mca_btl_openib.* files. This will suppress the warning that you're seeing. As for your problem with MPI_SEND, do you

Re: [OMPI users] openmpi and Torque

2007-04-14 Thread Jeff Squyres
Excellent -- thanks! I used your work as a starting point and tweaked it a bit further:
- Parse the pbs-config LDFLAGS into LIBS and LDFLAGS
- Look for pbs-config in both the default $PATH and the tree where --with-tm was specified
- Remove OMPI_CHECK_PACKAGE from the pbs-config-was-found code

Re: [OMPI users] openmpi and Torque

2007-04-13 Thread Bas van der Vlies
Here is a new ompi_check_tm.m4 that has all the functionality (hopefully). Regards, Bas van der Vlies e-mail: b...@sara.nl

Re: [OMPI users] openmpi and Torque

2007-04-12 Thread Bas van der Vlies
Brian Barrett wrote: On Apr 12, 2007, at 3:45 AM, Bas van der Vlies wrote: Jeff Squyres wrote: On Apr 11, 2007, at 8:08 AM, Bas van der Vlies wrote: The OMPI_CHECK_PACKAGE macro is a rather nasty macro that tries to reduce the replication of checking for a header then a library, then set

Re: [OMPI users] openmpi and Torque

2007-04-12 Thread Brian Barrett
On Apr 12, 2007, at 3:45 AM, Bas van der Vlies wrote: Jeff Squyres wrote: On Apr 11, 2007, at 8:08 AM, Bas van der Vlies wrote: The OMPI_CHECK_PACKAGE macro is a rather nasty macro that tries to reduce the replication of checking for a header then a library, then setting CFLAGS, LDFLAGS,

Re: [OMPI users] openmpi and Torque

2007-04-12 Thread Bas van der Vlies
Jeff Squyres wrote: On Apr 11, 2007, at 8:08 AM, Bas van der Vlies wrote: The OMPI_CHECK_PACKAGE macro is a rather nasty macro that tries to reduce the replication of checking for a header then a library, then setting CFLAGS, LDFLAGS, LIBS, and all that. There are two components that use

Re: [OMPI users] openmpi and Torque

2007-04-11 Thread Jeff Squyres
On Apr 11, 2007, at 8:08 AM, Bas van der Vlies wrote: The OMPI_CHECK_PACKAGE macro is a rather nasty macro that tries to reduce the replication of checking for a header then a library, then setting CFLAGS, LDFLAGS, LIBS, and all that. There are two components that use the TM libraries,

Re: [OMPI users] openmpi and Torque

2007-04-11 Thread Bas van der Vlies
The OMPI_CHECK_PACKAGE macro is a rather nasty macro that tries to reduce the replication of checking for a header then a library, then setting CFLAGS, LDFLAGS, LIBS, and all that. There are two components that use the TM libraries, so we have a centralized macro that sets the configuratio

Re: [OMPI users] openmpi and Torque

2007-04-06 Thread Brian W. Barrett
On Apr 6, 2007, at 10:42 AM, Bas van der Vlies wrote: On Apr 6, 2007, at 6:18 PM, Jeff Squyres wrote: On Apr 6, 2007, at 12:14 PM, Bas van der Vlies wrote: Have you run into a situation where OMPI gets the wrong flags because it's not using pbs-config? Yes, we install the torque header fil

Re: [OMPI users] openmpi and Torque

2007-04-06 Thread Bas van der Vlies
On Apr 6, 2007, at 6:18 PM, Jeff Squyres wrote: On Apr 6, 2007, at 12:14 PM, Bas van der Vlies wrote: Have you run into a situation where OMPI gets the wrong flags because it's not using pbs-config? Yes, we install the torque header files in /usr/include/torque and the libraries in /usr/l

Re: [OMPI users] openmpi and Torque

2007-04-06 Thread Jeff Squyres
On Apr 6, 2007, at 12:14 PM, Bas van der Vlies wrote: Have you run into a situation where OMPI gets the wrong flags because it's not using pbs-config? Yes, we install the torque header files in /usr/include/torque and the libraries in /usr/lib. This setup does not work with openmpi configure s

Re: [OMPI users] openmpi and Torque

2007-04-06 Thread Bas van der Vlies
On Apr 6, 2007, at 2:14 PM, Jeff Squyres wrote: On Apr 5, 2007, at 3:50 PM, Bas van der Vlies wrote: I am just trying to enable PBS/Torque support in Open MPI with the --with-tm option. My question is why the utility 'pbs-config' is not used to determine the location of the include/libra

Re: [OMPI users] openmpi and Torque

2007-04-06 Thread Jeff Squyres
On Apr 5, 2007, at 3:50 PM, Bas van der Vlies wrote: I am just trying to enable PBS/Torque support in Open MPI with the --with-tm option. My question is why the utility 'pbs-config' is not used to determine the location of the include/library directory. It is standard included in the torque