On Thu, Feb 06, 2003 at 05:46:54PM -0600, Jeremy Enos wrote:
> It is possible, tho I have my doubts if the DB is even the problem. You
> can try it though... When the server is NOT running...
> rm /var/spool/pbs/server_priv/serverdb

Thanks for the help so far. Your doubts are correct. I removed the db,
and it still sucks up all the memory. After looking through the serverlogs,
there are a number of strange errors, but none that indicate more than I
already know. One job was terminated because of resources:
02/06/2003 11:33:01;0010;PBS_Server;Job;465.vonkarman;Exit_status=143
resources_used.cput=00:00:18 resources_used.mem=3592kb
resources_used.vmem=8360kb resources_used.walltime=211:10:02
but I do not believe it is the job's fault. It has been running for far
longer before, and on a number of different clusters. The professors who
are running the code assure me that the code is quite stable. I have my
doubts, but I don't think it's the code in this situation. The other errors
are Protocol failures on various nodes, and this one:
02/06/2003 11:35:24;0080;PBS_Server;Req;req_reject;Reject reply code=15018,
aux= 0, type=9, from root@vonkarman
and the last error produced before endless "log opened" lines is this:
02/06/2003 11:58:59;0100;PBS_Server;Req;;Type 57 request received from
pbs_mom@n 58.cfd, sock=11 02/06/2003 11:59:0
repeated about 5 times. Those are the only strange log entries I can find,
but they don't seem to imply a whole lot.
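
For sifting through a large serverlog for patterns, a minimal Python sketch
like the one below could tally entries by event code and object type. It
assumes the semicolon-separated layout seen in the entries above (timestamp,
event code, server name, object type, object id, message). The function names
and field labels are hypothetical, chosen for illustration, and are not part
of PBS.

```python
from collections import Counter

def parse_pbs_log_line(line):
    """Split one server log line into labeled fields, or return None.

    Assumes six semicolon-separated fields, as in the entries quoted above.
    The field labels are illustrative, not official PBS names.
    """
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != 6:
        return None  # wrapped continuation line or unexpected format
    keys = ("timestamp", "code", "server", "obj_type", "obj_id", "message")
    return dict(zip(keys, parts))

def count_by_type(lines):
    """Tally entries by (event code, object type), skipping unparsable lines."""
    counts = Counter()
    for line in lines:
        entry = parse_pbs_log_line(line)
        if entry is not None:
            counts[(entry["code"], entry["obj_type"])] += 1
    return counts

# Sample lines copied from the log excerpts above, including a wrapped line.
sample = [
    "02/06/2003 11:33:01;0010;PBS_Server;Job;465.vonkarman;Exit_status=143",
    "02/06/2003 11:35:24;0080;PBS_Server;Req;req_reject;Reject reply code=15018,",
    "aux= 0, type=9, from root@vonkarman",
]

for key, n in count_by_type(sample).items():
    print(key, n)
```

Passing an open log file to count_by_type in place of the sample list should
work the same way, since it only iterates over lines.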
--
Andy King [EMAIL PROTECTED]
"A Turin Turambar turun' ambartanen"
"Guinness for strength"
