I'm running this on a development cluster, testing the implementation of
h_rt limits and job status email functionality.
Job 187 (mpirun) Aborted
Exit Status = 0
Signal = KILL
User = lanew
Queue = short.q@cscld1-0-2
Host = cscld1-0-2.local
Start Time
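The abort mail above (Signal = KILL) is what SGE sends when a job runs past its hard runtime limit. A minimal job script enabling both features under test might look like this sketch; the mail address and application name are placeholders, not taken from the thread:

```shell
#!/bin/bash
# Sketch of an SGE job script (placeholder address and app name).
#$ -l h_rt=01:00:00     # hard runtime limit: SGE delivers SIGKILL when exceeded
#$ -m bea               # mail on (b)egin, (e)nd, (a)bort
#$ -M user@example.com  # where the status mail goes
#$ -cwd                 # run from the submission directory
mpirun ./my_app
```

The `#$` lines are embedded qsub options, so the same flags could equally be given on the qsub command line.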
SGE is fine on 1 Gb fabrics, and I don't know of anyone who uses 10 Gb for
SGE unless it's a combined network fabric carrying storage and
application traffic along with SGE traffic on the same links, or unless you
are running all-new hardware with 10 Gb for everything and maybe a 1 Gb NIC
held back for SGE traffic.
If a cluster is running on a relatively slow networking backbone (say
gigabit Ethernet or 10 Gb Ethernet, as opposed to InfiniBand), is there any
commonly accepted point at which increasing the number of nodes in a queue
negatively affects the performance of the queue? Is there any general rule
of thumb?
Reuti,
1.
The exechost isn't the head node, is it? We've always referred to our SGE
clusters as having three types of nodes: submit nodes, compute nodes, and
head nodes.
Compute ring is an OpenMPI term for the slots to which processes for the job
are dispatched, but I meant the compute nodes that take part in the job.
Hi,
On 23.09.2015 at 21:18, Lane, William wrote:
> Reuti,
>
> 1.
> If more than one compute node takes part in the compute ring, how does one
> determine
> which one is the exechost?
What do you mean by compute ring - a parallel job?
The exechost is the one where the job script is executed.
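To make the exechost answer concrete: in a parallel job, SGE starts the job script on a single host (the exechost) and records every host granted slots in the file named by $PE_HOSTFILE. A minimal sketch, assuming it runs inside an SGE parallel job (outside SGE, $PE_HOSTFILE is simply unset and only the hostname prints):

```shell
# Print the host actually executing the job script (the exechost),
# then list every host taking part in the parallel job, if known.
echo "Exec host: $(hostname)"
if [ -n "${PE_HOSTFILE:-}" ]; then
  # Each line of the PE hostfile: hostname  slots  queue  processor-range
  cat "$PE_HOSTFILE"
fi
```

Running this as the first lines of the job script is a quick way to see which of the participating compute nodes SGE picked as the exechost.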