In initial testing of the grid engine, the jobs are submitted, show up
in the queue, and then disappear.
These are test jobs written to run for 10 minutes, and they seem to be
dropped as soon as they are queued.
He tested the jobs from the command line and they ran fine.
[ @blade5-1-1 test]$ /opt/sge/bin/lx-amd64/qsub qsubscript
Your job 3 ("qsubscript") has been submitted
[ @blade5-1-1 test]$ qstat
[ @blade5-1-1 test]$ qstat -f
queuename            qtype resv/used/tot. load_avg arch      states
---------------------------------------------------------------------------------
all.q@blade5-1-1.    BIP   0/0/24         0.05     lx-amd64
---------------------------------------------------------------------------------
all.q@blade5-1-15.   BIP   0/0/24         0.00     lx-amd64
---------------------------------------------------------------------------------
all.q@blade5-1-4.    BIP   0/0/24         0.03     lx-amd64
---------------------------------------------------------------------------------
all.q@blade5-1-5.    BIP   0/0/24         0.00     lx-amd64
---------------------------------------------------------------------------------
all.q@blade5-3-1.    BIP   0/0/0          -NA-     -NA-      au
[ @blade5-1-1 test]$ qstat -j 3
Following jobs do not exist:
3
[ @blade5-1-1 test]$ which qstat
/opt/sge/bin/lx-amd64/qstat
Here is my qmaster messages file. It has good information suggesting
that something was missed during the install.
[root@blade5-1-1 source]# cd ..
[root@blade5-1-1 sge]# cat ./default/spool/qmaster/messages
06/02/2014 10:11:58| main|blade5-1-1|I|read job database with 0 entries in 0 seconds
06/02/2014 10:11:58| main|blade5-1-1|E|error opening file "/opt/sge/default/common/./sched_configuration" for reading: No such file or directory
06/02/2014 10:11:58| main|blade5-1-1|E|error opening file "/opt/sge/default/spool/qmaster/./sharetree" for reading: No such file or directory
06/02/2014 10:11:58| main|blade5-1-1|I|qmaster hard descriptor limit is set to 4096
06/02/2014 10:11:58| main|blade5-1-1|I|qmaster soft descriptor limit is set to 1024
06/02/2014 10:11:58| main|blade5-1-1|I|qmaster will use max. 1004 file descriptors for communication
06/02/2014 10:11:58| main|blade5-1-1|I|qmaster will accept max. 950 dynamic event clients
06/02/2014 10:11:58| main|blade5-1-1|I|starting up SGE 8.1.6 (lx-amd64)
06/02/2014 10:11:58| main|blade5-1-1|W|can't open job sequence number file "jobseqnum": for reading: No such file or directory -- guessing next number
06/02/2014 10:11:58| main|blade5-1-1|W|can't open ar sequence number file "arseqnum": for reading: No such file or directory -- guessing next number
06/02/2014 10:16:56|worker|blade5-1-1|E|adminhost "blade5-1-1.dsg.wustl.edu" already exists
...
06/02/2014 10:19:30|worker|blade5-1-1|E|adminhost "blade5-3-2.dsg.wustl.edu" already exists
06/06/2014 11:48:42|worker|blade5-1-1|E|invalid job object in job submission from user "joe", commproc "qsub" on host "blade5-1-1"
06/06/2014 11:53:16|worker|blade5-1-1|E|invalid job object in job submission from user "joe", commproc "qsub" on host "blade5-1-1"
[root@blade5-1-1 sge]# ^C
[root@blade5-1-1 sge]# cd "/opt/sge/default/common/sched_configuration
> ^C
[root@blade5-1-1 sge]# cd /opt/sge/default/common/sched_configuration
bash: cd: /opt/sge/default/common/sched_configuration: Not a directory
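
One way to triage a messages file like the one above is sketched below.
It assumes the pipe-separated layout seen in the listing
(timestamp|thread|host|level|message), with one entry per line in the
real file. The function names sge_problems and sge_check_paths are made
up for illustration and are not part of Grid Engine. The first prints
only the E and W entries; the second takes each path quoted in an
"error opening file" entry and reports whether it is currently a
directory, a regular file, or missing.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Grid Engine) for reading a qmaster
# messages file. Assumed line layout, taken from the listing above:
#   date time|thread|host|level|message

# Print only the error (E) and warning (W) entries of the given file.
sge_problems() {
    awk -F'|' '$4 == "E" || $4 == "W"' "$1"
}

# For every "error opening file" entry, extract the quoted path and
# report what that path is on disk right now.
sge_check_paths() {
    awk -F'|' '$4 == "E" && $5 ~ /^error opening file/' "$1" |
    sed -n 's/.*error opening file "\([^"]*\)".*/\1/p' |
    sort -u |
    while IFS= read -r path; do
        if [ -d "$path" ]; then
            echo "directory: $path"
        elif [ -f "$path" ]; then
            echo "file: $path"
        else
            echo "missing: $path"
        fi
    done
}
```

After sourcing these functions, they could be run as
sge_problems /opt/sge/default/spool/qmaster/messages and
sge_check_paths /opt/sge/default/spool/qmaster/messages. Because the
check looks at the filesystem at the time it runs, a path that was
missing when qmaster started may be reported as a file if it has been
created since.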
--
Dan Hyatt
Division of Statistical Genomics
Washington University School of Medicine
4444 Forest Park Blvd, Campus Box 8506
St. Louis, MO 63108
314 747 4767 (o)
314 473 8713 (c)