Hi Reuti,

mpi-ring is the test program from Intel that sends a message around a ring. If I run it manually with mpirun, it works fine. The problem only occurs when I use OSG to submit jobs with more than 16 slots (each node has 16 processors).
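A minimal sketch for checking which hosts a job was actually given, assuming the standard Grid Engine job environment variables PE_HOSTFILE and NSLOTS are set for parallel jobs. The helper name summarize_pe_hostfile is hypothetical, not part of Grid Engine:

```shell
#!/bin/bash
# Sketch only: summarize a Grid Engine PE hostfile, which has one line per
# host in the form "hostname slots queue processor-range".
# summarize_pe_hostfile is a hypothetical helper name.
summarize_pe_hostfile() {
    local hostfile="$1"
    # Print "host slots" for each line, then the total slot count.
    awk '{ print $1, $2; total += $2 } END { print "total", total + 0 }' "$hostfile"
}

# Inside a parallel job, report the allocation before calling mpirun.
# Outside a job (PE_HOSTFILE unset) this block is skipped.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    echo "NSLOTS=${NSLOTS:-unset}"
    summarize_pe_hostfile "$PE_HOSTFILE"
fi
```

Placed in the job script before the mpirun line, this would show in the job output whether a 17-slot request was spread over more than one host.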
Tung

On Fri, Nov 14, 2014 at 12:44 AM, Reuti <[email protected]> wrote:

> Hi,
>
> On 13.11.2014 at 18:09, Doan Trung Tung wrote:
>
> > I have OSG installed as a roll on a Rocks cluster installation for a cluster of 16 nodes; each node has 16 processors. I'm new to OSG, so I left everything at the defaults.
> > When I submit the mpi-ring example using qsub, if the number of slots is less than or equal to 16, all threads run on a single random node. So I increased the number of slots to a number larger than 16, hoping that they would run on different nodes, but I get errors instead.
> >
> > Here is the script I used to submit mpi-ring:
> > #!/bin/bash
> >
> > #$ -cwd
> > #$ -S /bin/bash
> > #$ -j y
> > #$ -pe orte 8
> > mpirun $HOME/testmpi/mpi-ring
>
> What is mpi-ring in detail: where is the source, or which MPI library is it from?
>
> -- Reuti
>
> > (orte is one of 4 default parallel environments the system has)
> >
> > If I change the number of slots to 17 instead of 8, I get this error:
> > APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
> > A strange file was also produced: core.xxxx
> >
> > Why can't I submit more than 16 slots?
> >
> > Thanks.
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users

--
Doan Trung Tung, PhD.
Researcher, HPC - Hanoi University of Technologies
Mobile: 0914720240
