Re: [OMPI users] trouble using openmpi under slurm

2010-07-08 Thread Gus Correa
Douglas Guptill wrote: On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Castain wrote: No, afraid not. Things work pretty well, but there are places where things just don't mesh. Sub-node allocation in particular is an issue as it implies binding, and slurm and ompi have conflicting methods.

Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Jeff Squyres
+1. FWIW, Open MPI works pretty well with SLURM; I use it back here at Cisco for all my testing. That one particular option you're testing doesn't seem to work, but all in all, the integration works fairly well. On Jul 7, 2010, at 3:27 PM, Ralph Castain wrote: > You'll get passionate advocat

Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Ralph Castain
You'll get passionate advocates from all the various resource managers - there really isn't a right/wrong answer. Torque is more widely used, but any of them will do. None are perfect, IMHO. On Jul 7, 2010, at 1:16 PM, Douglas Guptill wrote: > On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Ca

Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Douglas Guptill
On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Castain wrote: > No, afraid not. Things work pretty well, but there are places > where things just don't mesh. Sub-node allocation in particular is > an issue as it implies binding, and slurm and ompi have conflicting > methods. > > It all can get

Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Jeff Squyres
On Jul 7, 2010, at 2:37 PM, Ralph Castain wrote: > No, afraid not. Things work pretty well, but there are places where things > just don't mesh. Sub-node allocation in particular is an issue as it implies > binding, and slurm and ompi have conflicting methods. > > It all can get worked out, b

Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Ralph Castain
No, afraid not. Things work pretty well, but there are places where things just don't mesh. Sub-node allocation in particular is an issue as it implies binding, and slurm and ompi have conflicting methods. It all can get worked out, but we have limited time and nobody cares enough to put in t

Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread David Roundy
Alas, I'm sorry to hear that! I had hoped (assumed?) that the slurm team would be hand-in-glove with the OMPI team in making sure the interface between the two is smooth. :( David On Wed, Jul 7, 2010 at 11:09 AM, Ralph Castain wrote: > Ah, if only it were that simple. Slurm is a very difficult

Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Ralph Castain
Ah, if only it were that simple. Slurm is a very difficult beast to interface with, and I have yet to find a single, reliable marker across the various slurm releases to detect options we cannot support. On Jul 7, 2010, at 11:59 AM, David Roundy wrote: > On Wed, Jul 7, 2010 at 10:26 AM, Ralph

Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread David Roundy
On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain wrote: > I'm afraid the bottom line is that OMPI simply doesn't support core-level > allocations. I tried it on a slurm machine available to me, using our devel > trunk as well as 1.4, with the same results. > > Not sure why you are trying to run th

Re: [OMPI users] trouble using openmpi under slurm

2010-07-07 Thread Ralph Castain
I'm afraid the bottom line is that OMPI simply doesn't support core-level allocations. I tried it on a slurm machine available to me, using our devel trunk as well as 1.4, with the same results. Not sure why you are trying to run that way, but I'm afraid you can't do it with OMPI. On Jul 6, 20

Re: [OMPI users] trouble using openmpi under slurm

2010-07-06 Thread David Roundy
On Tue, Jul 6, 2010 at 12:31 PM, Ralph Castain wrote: > Thanks - that helps. > > As you note, the issue is that OMPI doesn't support the core-level allocation > options of slurm - never has, probably never will. What I found interesting, > though, was that your envars don't anywhere indicate tha

Re: [OMPI users] trouble using openmpi under slurm

2010-07-06 Thread Ralph Castain
Thanks - that helps. As you note, the issue is that OMPI doesn't support the core-level allocation options of slurm - never has, probably never will. What I found interesting, though, was that your envars don't anywhere indicate that this is what you requested. I don't see anything there that w

Re: [OMPI users] trouble using openmpi under slurm

2010-07-06 Thread David Roundy
Ah yes, it's the versions of each that are packaged in Debian testing, which are openmpi 1.4.1 and slurm 2.1.9. David On Tue, Jul 6, 2010 at 11:38 AM, Ralph Castain wrote: > It would really help if you told us what version of OMPI you are using, and > what version of SLURM. > > > On Jul 6, 201

Re: [OMPI users] trouble using openmpi under slurm

2010-07-06 Thread Ralph Castain
It would really help if you told us what version of OMPI you are using, and what version of SLURM. On Jul 6, 2010, at 12:16 PM, David Roundy wrote: > Hi all, > > I'm running into trouble running an openmpi job under slurm. I > imagine the trouble may be in my slurm configuration, but since th

Re: [OMPI users] trouble using openmpi under slurm

2010-07-06 Thread David Roundy
For what it's worth, the slurm environment variables are: SLURM_JOBID=2817 SLURM_JOB_NUM_NODES=1 SLURM_TASKS_PER_NODE=1 SLURM_TOPOLOGY_ADDR_PATTERN=node SLURM_PRIO_PROCESS=0 SLURM_JOB_CPUS_PER_NODE=2 SLURM_JOB_NAME=submit.sh SLURM_PROCID=0 SLURM_CPUS_ON_NODE=2 SLURM_NODELIST=node02 SLURM_NNODES=1
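
For anyone wanting to collect the same kind of listing from inside a job script, here is a minimal sketch in Python. It is not from the thread: the helper name slurm_vars is hypothetical, and the only assumption is that the variables of interest start with the SLURM_ prefix, as the ones quoted above do.

```python
import os


def slurm_vars(env=None):
    """Return the SLURM_* entries of an environment mapping, sorted by name.

    `env` defaults to the current process environment; passing a plain dict
    makes the helper easy to try outside of a real slurm allocation.
    """
    env = os.environ if env is None else env
    return {name: env[name] for name in sorted(env) if name.startswith("SLURM_")}


if __name__ == "__main__":
    # Print one NAME=value pair per line, which is easier to read
    # (and to paste into a mail) than a single run-on line.
    for name, value in slurm_vars().items():
        print(f"{name}={value}")
```

Run from within the batch script, this prints only what slurm actually exported to the job, which is the information Ralph asks about when checking whether a core-level allocation was requested.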

[OMPI users] trouble using openmpi under slurm

2010-07-06 Thread David Roundy
Hi all, I'm running into trouble running an openmpi job under slurm. I imagine the trouble may be in my slurm configuration, but since the error itself involves mpirun crashing, I thought I'd best ask here first. The error message I get is: --