Hi Reuti,

mpi-ring is the test program from Intel that sends a message around a ring. If I
run the program manually using mpirun, it works just fine. The problem occurs
only when I use OSG to submit jobs with more than 16 slots (each node
has 16 processors).
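
One way to check what the scheduler actually grants to such a job would be to print the allocation from inside the job script. The following is only a minimal sketch, assuming the standard Grid Engine variables NSLOTS and PE_HOSTFILE are set for parallel jobs and that each line of the host file starts with a host name followed by a slot count. The function name summarize_pe_hostfile is made up for illustration and is not part of Grid Engine.

```shell
#!/bin/bash
# Sketch only: summarize a Grid Engine PE host file.
# Assumed line format: <host> <slots> <queue> <processor range>
# summarize_pe_hostfile is a hypothetical helper, not a Grid Engine command.
summarize_pe_hostfile() {
    awk '{ printf "%s %d\n", $1, $2; total += $2 }
         END { printf "total %d\n", total }' "$1"
}

# Inside a parallel job, Grid Engine is assumed to export these variables.
# Outside a job this block is skipped.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    echo "NSLOTS=${NSLOTS:-unset}"
    summarize_pe_hostfile "$PE_HOSTFILE"
fi
```

If a 17-slot job lists two hosts here, the allocation itself spans nodes, and the failure would more likely lie in how mpirun starts the remote processes.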

Tung

On Fri, Nov 14, 2014 at 12:44 AM, Reuti <[email protected]> wrote:

> Hi,
>
> On 13.11.2014 at 18:09, Doan Trung Tung wrote:
>
> > I have OSG installed as a roll on a Rocks cluster installation for a
> > cluster of 16 nodes; each node has 16 processors. I'm new to OSG, so I left
> > everything at the defaults.
> > When I submit the mpi-ring example using qsub, if the number of slots is
> > less than or equal to 16, all processes run on a single random node. So I
> > increased the number of slots to a number larger than 16, hoping that
> > they would run on different nodes, but instead I get errors.
> >
> > Here is the script I used to submit mpi-ring:
> > #!/bin/bash
> >
> > #$ -cwd
> > #$ -S /bin/bash
> > #$ -j y
> > #$ -pe orte 8
> > mpirun $HOME/testmpi/mpi-ring
>
> What is mpi-ring in detail: where is the source, and which MPI library is it from?
>
> -- Reuti
>
>
> > (orte is one of 4 default parallel environments the system has)
> >
> > If I change the number of slots to 17 instead of 8, I get this error:
> > APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
> > A strange file was also produced: core.xxxx
> >
> > Why can't I submit with more than 16 slots?
> >
> > Thanks.
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users
>
>


-- 
Doan Trung Tung, PhD.
Researcher, HPC - Hanoi University of Technologies
Mobile: 0914720240