Douglas Guptill wrote:
On Wed, Jul 07, 2010 at 12:37:54PM -0600, Ralph Castain wrote:
No, afraid not. Things work pretty well, but there are places
where things just don't mesh. Sub-node allocation in particular is
an issue as it implies binding, and slurm and ompi have conflicting
methods.
+1.
FWIW, Open MPI works pretty well with SLURM; I use it back here at Cisco for
all my testing. That one particular option you're testing doesn't seem to
work, but all in all, the integration works fairly well.
On Jul 7, 2010, at 3:27 PM, Ralph Castain wrote:
You'll get passionate advocates from all the various resource managers - there
really isn't a right/wrong answer. Torque is more widely used, but any of them
will do.
None are perfect, IMHO.
On Jul 7, 2010, at 1:16 PM, Douglas Guptill wrote:
On Jul 7, 2010, at 2:37 PM, Ralph Castain wrote:
No, afraid not. Things work pretty well, but there are places where things
just don't mesh. Sub-node allocation in particular is an issue as it implies
binding, and slurm and ompi have conflicting methods.
It all can get worked out, but we have limited time and nobody cares enough to
put in t
Alas, I'm sorry to hear that! I had hoped (assumed?) that the slurm
team would be hand-in-glove with the OMPI team in making sure the
interface between the two is smooth. :(
David
On Wed, Jul 7, 2010 at 11:09 AM, Ralph Castain wrote:
> Ah, if only it were that simple. Slurm is a very difficult
Ah, if only it were that simple. Slurm is a very difficult beast to interface
with, and I have yet to find a single, reliable marker across the various slurm
releases to detect options we cannot support.
On Jul 7, 2010, at 11:59 AM, David Roundy wrote:
On Wed, Jul 7, 2010 at 10:26 AM, Ralph Castain wrote:
I'm afraid the bottom line is that OMPI simply doesn't support core-level
allocations. I tried it on a slurm machine available to me, using our devel
trunk as well as 1.4, with the same results.
Not sure why you are trying to run that way, but I'm afraid you can't do it
with OMPI.
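For context, a core-level allocation of the kind being discussed would be requested through SLURM's per-core options. A hypothetical sketch (option names as documented for slurm 2.x `salloc`/`sbatch`; the job name and binary are made up, and the thread's point is that mpirun cannot honor the resulting allocation):

```shell
# Hypothetical core-level request under slurm 2.x: two tasks, one core
# each, launched through Open MPI's mpirun. This is the pattern the
# thread says OMPI does not support.
salloc -N 1 -n 2 --ntasks-per-core=1 mpirun ./my_mpi_app
```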
On Tue, Jul 6, 2010 at 12:31 PM, Ralph Castain wrote:
Thanks - that helps.
As you note, the issue is that OMPI doesn't support the core-level allocation
options of slurm - never has, probably never will. What I found interesting,
though, was that your envars don't anywhere indicate that this is what you
requested. I don't see anything there that w
Ah yes,
It's the versions of each that are packaged in Debian testing, which
are openmpi 1.4.1 and slurm 2.1.9.
David
On Tue, Jul 6, 2010 at 11:38 AM, Ralph Castain wrote:
It would really help if you told us what version of OMPI you are using, and
what version of SLURM.
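One way to pull those numbers from a shell on the cluster (both tools report their own version; exact banner wording varies by release):

```shell
# Report the Open MPI and SLURM versions in use.
mpirun --version   # Open MPI prints its version banner
srun --version     # SLURM prints e.g. "slurm 2.1.9"
```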
On Jul 6, 2010, at 12:16 PM, David Roundy wrote:
For what it's worth, the slurm environment variables are:
SLURM_JOBID=2817
SLURM_JOB_NUM_NODES=1
SLURM_TASKS_PER_NODE=1
SLURM_TOPOLOGY_ADDR_PATTERN=node
SLURM_PRIO_PROCESS=0
SLURM_JOB_CPUS_PER_NODE=2
SLURM_JOB_NAME=submit.sh
SLURM_PROCID=0
SLURM_CPUS_ON_NODE=2
SLURM_NODELIST=node02
SLURM_NNODES=1
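A dump like the one above can be captured from inside the batch script itself; a minimal sketch, assuming only that SLURM exports its variables with the `SLURM_` prefix into the job's environment:

```shell
# Print every SLURM_* variable the job step inherited, sorted,
# so it can be pasted into a report like the one above.
env | grep '^SLURM_' | sort
```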
Hi all,
I'm running into trouble running an openmpi job under slurm. I
imagine the trouble may be in my slurm configuration, but since the
error itself involves mpirun crashing, I thought I'd best ask here
first. The error message I get is:
--