Update:
Going through the spool messages of comp065 I found this message:
10/21/2014 14:48:34| main|comp065|E|can't start job "155": can't open file
/opt/gridengine/default/spool/comp065/active_jobs/155.1/pe_hostfile: No
such file or
Note that the spool directory is an NFS-mounted directory.
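For anyone who wants to scan a spool messages file for entries like the one above, here is a minimal sketch. It assumes every line follows the same pipe-separated layout as that entry (timestamp, thread, host, level, free text); the name parse_spool_line and the field names are made up for illustration and are not part of Grid Engine. The sample line is the wrapped message above joined into one line, still truncated where the snippet ends.

```python
from datetime import datetime
from typing import NamedTuple


class SpoolMessage(NamedTuple):
    timestamp: datetime
    thread: str
    host: str
    level: str
    text: str


def parse_spool_line(line: str) -> SpoolMessage:
    # Assumed layout: timestamp|thread|host|level|free text.
    # maxsplit=4 keeps any '|' inside the free text intact.
    stamp, thread, host, level, text = line.split("|", 4)
    return SpoolMessage(
        timestamp=datetime.strptime(stamp.strip(), "%m/%d/%Y %H:%M:%S"),
        thread=thread.strip(),
        host=host.strip(),
        level=level.strip(),
        text=text.strip(),
    )


# The message quoted above, joined into one line (truncated as in the post).
line = ('10/21/2014 14:48:34| main|comp065|E|can\'t start job "155": '
        "can't open file /opt/gridengine/default/spool/comp065/"
        "active_jobs/155.1/pe_hostfile: No such file or")
msg = parse_spool_line(line)
print(msg.timestamp, msg.host, msg.level)
print(msg.text)
```

Filtering on level == "E" would then pick out error entries such as this one.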
I tried
Dear all,
I am currently configuring Grid Engine on a freshly installed Rocks cluster.
I have 3 compute nodes. Whenever I submit a job, it only runs on 1 of the
nodes, and the jobs on the other nodes hang in the 't' state.
Running 'qconf -tsm', I get the following log:
Tue Oct 21 14:36:49 2014|
> "schedd_job_info" is switched on then? But even if switched off it should
> show up in `qalter -w p`.
Yes, it is on.
> And 24 slots per machine then - `qstat -g c ` reveals the slots as being free
> too?
A good question: it reveals that of the 72 slots, 70 are now (!) free and none
is in use. Wh
On 21.10.2014 at 13:45, Winkler, Ursula (ursula.wink...@uni-graz.at) wrote:
>> The qsub man page states that -w p and -w v don't take into account load
>> values. Possibly the job is requesting a complex whose value is determined
>> by a load sensor and the returned value is not suitable but n
> The qsub man page states that -w p and -w v don't take into account load
> values. Possibly the job is requesting a complex whose value is determined
> by a load sensor and the returned value is not suitable but not causing an
> alarm.
Should not "qstat -j " list the shortage of a complex?
On Tue, 21 Oct 2014 10:45:30 +
"Winkler, Ursula (ursula.wink...@uni-graz.at)"
wrote:
> Hi Reuti,
>
> no - and there is no resource shortage (other than the alleged lack of
> slots). And no host is in error state.
>
> Ursula
>
>
> -Original Message-
> From: Reuti [mailto:re...@staff.un
The allocation rule is "$fill_up" and not "$pe_slots". So that should be ok.
Ursula
-Original Message-
From: users-boun...@gridengine.org [mailto:users-boun...@gridengine.org] On
Behalf Of Simon Andrews
Sent: Tuesday, 21 October 2014 13:05
To: Gridengine Users Group
Subject
What is your allocation rule for the mpios parallel environment (qconf -sp
mpios)? Could it be that the allocation says that the slots all have to be on
the same physical node, and no single node has more than 64 slots available?
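To make the question above concrete, here is a toy model (not Grid Engine's scheduler) of how the allocation rule and the PE's own slot limit decide whether a job can be placed. It covers only "$pe_slots" (all slots on one host) and "$fill_up" (fill one host after another); the function name place, the host names and the idea that the 64 comes from the PE's slots setting are assumptions for illustration only.

```python
from typing import Dict, Optional


def place(requested: int, free: Dict[str, int], rule: str,
          pe_slots: int) -> Optional[Dict[str, int]]:
    """Return a host -> slots placement, or None if the job cannot start."""
    # The PE's own slot limit caps the job regardless of free host slots.
    if requested > pe_slots:
        return None
    if rule == "$pe_slots":
        # All slots must come from a single host.
        for host, n in free.items():
            if n >= requested:
                return {host: requested}
        return None
    if rule == "$fill_up":
        # Take as many slots as possible from each host in turn.
        placement: Dict[str, int] = {}
        remaining = requested
        for host, n in free.items():
            take = min(n, remaining)
            if take:
                placement[host] = take
                remaining -= take
            if remaining == 0:
                return placement
        return None
    raise ValueError(f"rule not covered by this sketch: {rule}")


# Hypothetical hosts: three machines with 24 free slots each (72 in total).
hosts = {"node1": 24, "node2": 24, "node3": 24}
print(place(72, hosts, "$pe_slots", pe_slots=72))  # no single host has 72
print(place(72, hosts, "$fill_up", pe_slots=72))   # spreads over all three
print(place(72, hosts, "$fill_up", pe_slots=64))   # blocked by the PE limit
```

In this model a 72-slot job fails under "$pe_slots" even with 72 free slots, and it also fails under "$fill_up" if the PE itself is limited to 64 slots, so both settings are worth checking in the output of qconf -sp mpios.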
-Original Message-
From: users-boun...@gridengine.org [m
Hi Reuti,
no - and there is no resource shortage (other than the alleged lack of slots).
And no host is in error state.
Ursula
-Original Message-
From: Reuti [mailto:re...@staff.uni-marburg.de]
Sent: Tuesday, 21 October 2014 12:25
To: Gridengine Users Group
Cc: Winkler, Ursula (ursula
Hi,
On 21.10.2014 at 11:21, Ursula Winkler wrote:
> Hi gridengine members,
>
> I have run out of ideas on an annoying problem:
>
> A job with 72 slots does not start because "qstat -j " reports
> "cannot run in PE "mpios" because it only offers 64 slots", but there are 72
> free ("qalt
Hi gridengine members,
I have run out of ideas on an annoying problem:
A job with 72 slots does not start because "qstat -j " reports
"cannot run in PE "mpios" because it only offers 64 slots", but there
are 72 free ("qalter -w p " and "qalter -w v " report
"verification: found possible