I recently created a website for myself, and threw this documentation up
there, where it might be a little easier to read:
http://prentice.bisbal.co/hpc/sge/cannot_run_on_host
--
Prentice
On 02/10/2012 09:52 AM, Prentice Bisbal wrote:
Jerome,
I had a similar problem a couple of years ago.
We recently experienced this error on GE 6.2u5.
Chris
On 02/16/2012 09:08 AM, Prentice Bisbal wrote:
On 02/15/2012 07:03 PM, Dave Love wrote:
Prentice Bisbal prent...@ias.edu writes:
Jerome,
I had a similar problem a couple of years ago. I would get this error:
cannot run on host
Prentice Bisbal prent...@ias.edu writes:
Jerome,
I had a similar problem a couple of years ago. I would get this error:
cannot run on host node64.aurora until clean up of an previous run has
finished
Is this known to happen with a recent version, or just old ones?
Jerome,
I had a similar problem a couple of years ago. I would get this error:
cannot run on host node64.aurora until clean up of an previous run has
finished
(Aurora's my cluster's name, so I use that as my top-level domain on my
cluster nodes)
Fixing this problem is a bit tedious.
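For anyone who wants to spot affected nodes quickly, here is a minimal sketch (not part of the original post) that pulls the host names out of saved `qstat -j` output. It assumes the message wording is exactly as quoted in this thread, including "an previous"; the function name `hosts_pending_cleanup` is hypothetical, and whether the host name appears in quotes in real output is an assumption, so the pattern accepts both forms.

```python
import re

# Pattern for the scheduler message quoted in this thread. The wording
# (including "an previous") is copied from the posts. Quotes around the
# host name are treated as optional because that detail is an assumption.
CLEANUP_RE = re.compile(
    r'cannot run on host\s+"?(?P<host>[^\s"]+)"?\s+'
    r'until clean up of an previous\s+run has\s+finished'
)


def hosts_pending_cleanup(qstat_text):
    """Return the sorted, de-duplicated hosts named in the cleanup message."""
    # Collapse all whitespace so messages wrapped across lines still match.
    flat = " ".join(qstat_text.split())
    return sorted({m.group("host") for m in CLEANUP_RE.finditer(flat)})
```

Feeding it the text of `qstat -j jid` (pasted or captured however you like) should give a list of the nodes the scheduler is refusing to use.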
Hi,
In the old thread, DT suggested taking down the entire cluster (SGE) in order to
clean up the EH_reschedule_unknown_list.
Regards
On 2/10/2012 9:52 AM, Prentice Bisbal wrote:
Jerome,
I had a similar problem a couple of years ago. I would get this error:
cannot run on host node64.aurora until
Dear all,
I have SGE version GE 6.2u2_1 on a Rocks cluster.
For the past few days, a node has refused to run jobs. Using qstat -j jid, I
noticed this line at the end of the output:
cannot run on host compute-2-15.local until clean up of an previous
run has finished
I checked node 2-15, but the
Check the CELL/spool/ directory on the qmaster and on the nodes.
On 2/9/2012 12:51 PM, Jerome wrote:
Dear all,
I have SGE version GE 6.2u2_1 on a Rocks cluster.
For the past few days, a node has refused to run jobs. Using qstat -j jid, I
noticed this line at the end of the output:
cannot run on host
Dear Hung-Sheng
Thanks for your quick reply.
I've checked CELL/spool/ on the node, and the jobs directory is empty.
On the master node, the jobs directory contains only the files
corresponding to the jobs that are running or waiting to be run.
Should I check a specific directory?
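The check described above can be sketched as a small script that lists whatever files are left under a spool jobs directory. This is only an illustration: the function name `list_spool_files` is hypothetical, and the directory to pass in depends on your installation (the thread only gives it as CELL/spool/), so no real path is assumed here.

```python
import os


def list_spool_files(jobs_dir):
    """Return the relative paths of all files under jobs_dir, sorted."""
    found = []
    # Walk the whole tree, since job files may sit in nested subdirectories.
    for root, _dirs, files in os.walk(jobs_dir):
        for name in files:
            full_path = os.path.join(root, name)
            found.append(os.path.relpath(full_path, jobs_dir))
    return sorted(found)
```

An empty list for the node's jobs directory matches what is reported above; any leftover entries on the node would be worth comparing against the job IDs that qstat shows.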
I am not sure, but if you look at the source code you can see that this error
message comes from the scheduler.
Maybe try restarting the qmaster when the system has no jobs running or
queued.
sorry
Sent from my iPad
On Feb 9, 2012, at 19:07, Jerome jer...@ibt.unam.mx wrote:
Dear Hung-Sheng
Thanks
One more thing:
maybe increase the debug level so you can get more info :-)
Sent from my iPad
On Feb 9, 2012, at 21:08, Hung-Sheng Tsao (laoTsao) laot...@gmail.com wrote:
I am not sure, but if you look at the source code you can see that this error
message comes from the scheduler.
Maybe try to restart the