Hi Bernard,
Now even MPI jobs just sit there, along with the PVM jobs. Both have to be
started with qrun as root.
What's going on? Can TORQUE/Maui change behaviour on their own over time?
My server is reasonably secure, so I highly doubt a security breach or
anything like that.
Btw, what would be a quick short-term solution, other than me sitting at the
terminal qrun'ing users' jobs? Could qmgr be useful here? Can you suggest a
quick fix (syntax etc.)?
---
Regards
From: "Bernard Li" <[EMAIL PROTECTED]>
To: "X Y" <[EMAIL PROTECTED]>, <[email protected]>
Subject: RE: [Oscar-users] PVM jobs need to be forced with qrun to run !
Date: Wed, 15 Feb 2006 22:10:18 -0800
So does the job just sit there in the queue and not run? Do the logs
(TORQUE, MAUI) say anything?
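
A minimal sketch of how that information could be gathered for one stuck
job, assuming the standard TORQUE and Maui client tools (qstat, tracejob,
checkjob) are on the PATH. The function name diagnose_job and the job ID
in the usage line are placeholders, not anything from this cluster, and
tracejob usually has to be run as root to read the server logs.

```shell
# Print scheduler-side diagnostics for a single queued job.
# Usage: diagnose_job <jobid>
diagnose_job() {
    jobid=$1
    # qstat -f: full job attributes (TORQUE)
    # tracejob: log entries that mention the job (TORQUE)
    # checkjob: the scheduler's view of why the job is idle (Maui)
    for cmd in "qstat -f" "tracejob" "checkjob"; do
        tool=${cmd%% *}
        if command -v "$tool" >/dev/null 2>&1; then
            echo "== $cmd $jobid =="
            $cmd "$jobid" || echo "($tool exited with a non-zero status)"
        else
            echo "== $tool not found in PATH, skipping =="
        fi
    done
    return 0
}
```

Usage would be something like `diagnose_job 123.headnode`, with the job ID
taken from the qstat listing.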
Cheers,
Bernard
________________________________
From: [EMAIL PROTECTED] on behalf of X Y
Sent: Wed 15/02/2006 02:43
To: [email protected]
Subject: [Oscar-users] PVM jobs need to be forced with qrun to run !
Hi,
My cluster specs/config:
Oscar version: 4.1
OS : Redhat 9 (x86)
with Default Oscar installation
Compute Nodes: 32 nodes
I'm able to run my MPI jobs fine. As soon as I qsub them they get queued
in the default queue (workq) and run.
But my PVM jobs won't run unless I su to root and manually (forcefully)
qrun them. Since the MPI jobs run fine, I doubt the problem is related to
resources_default.nodes being set (btw, that is set with qmgr, right?).
The PVM PBS job script is attached below just in case.
Any suggestions/ideas are welcome.
Regards
--
SD.
pvmpbscript:
[EMAIL PROTECTED] server_priv]# cat /home/oscartst/pbs_script.pvm
************************************
#!/bin/sh
### Job name
#PBS -N pvmtest
### Output files
#PBS -o pvmtest.out
#PBS -e pvmtest.err
### Queue name
#PBS -q workq
### Script Commands
cd $PBS_O_WORKDIR
# generate pvm nodes file
echo "* ep=$PBS_O_WORKDIR wd=$PBS_O_WORKDIR" > pvm_nodes
cat $PBS_NODEFILE >> pvm_nodes
# start pvm daemon & wait for slave daemons to start up
pvmd pvm_nodes &
#sleep 10
# run job
p=`pwd`
cp master1.c slave1.c /tmp
cd /tmp
gcc -I$PVM_ROOT/include master1.c -L$PVM_ROOT/lib/$PVM_ARCH -lpvm3 -o master1
gcc -I$PVM_ROOT/include slave1.c -L$PVM_ROOT/lib/$PVM_ARCH -lpvm3 -o slave1
cp master1 slave1 $p
cd $p
./master1
# wait again to make sure everyone's finished
# then kill master pvm daemon
#sleep 5
/usr/bin/killall -TERM pvmd3
# get rid of lock files & nodes file
uid=`id -u`
tail -n +2 $PBS_NODEFILE > pvm_nodes
/bin/rm -f /tmp/pvm?.$uid
crm pvm_nodes:/tmp/pvmd.$uid > /dev/null 2>&1
crm pvm_nodes:/tmp/pvml.$uid > /dev/null 2>&1
/bin/rm -f pvm_nodes
exit
*************************************
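
The "generate pvm nodes file" step in the script above copies
$PBS_NODEFILE as-is. TORQUE writes one line per allocated processor, so a
host can appear more than once when several processors per node are
requested. A hedged sketch of building the same hostfile with each host
listed once is below; the function name make_pvm_hostfile is hypothetical,
and whether duplicates actually occur depends on how the job requests
nodes.

```shell
# Write a PVM hostfile to stdout: one default-options line, then each
# host from the node file exactly once, in first-seen order.
# Usage: make_pvm_hostfile <nodefile> <workdir>
make_pvm_hostfile() {
    nodefile=$1
    workdir=$2
    # "*" sets default options (ep=, wd=) for the hosts listed after it
    echo "* ep=$workdir wd=$workdir"
    # skip blank lines and drop repeated host names
    awk 'NF && !seen[$1]++ { print $1 }' "$nodefile"
}
```

Inside the job script this could replace the echo/cat pair with
`make_pvm_hostfile "$PBS_NODEFILE" "$PBS_O_WORKDIR" > pvm_nodes`.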
_______________________________________________
Oscar-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oscar-users