On Thu, Feb 06, 2003 at 05:46:54PM -0600, Jeremy Enos wrote:
> It is possible, tho I have my doubts if the DB is even the problem.  You 
> can try it though... When the server is NOT running...
> rm /var/spool/pbs/server_priv/serverdb

Thanks for the help so far.  Your doubts are correct.  I removed the db,
and it still sucks up all the memory.  After looking through the server logs,
there are a number of strange errors, but none that indicate more than I
already know.  One job was terminated because of resources:

02/06/2003 11:33:01;0010;PBS_Server;Job;465.vonkarman;Exit_status=143
resources_used.cput=00:00:18 resources_used.mem=3592kb
resources_used.vmem=8360kb resources_used.walltime=211:10:02

but I do not believe it is the job's fault.  It has been running for far
longer before, and on a number of different clusters.  The professors who
are running the code assure me that the code is quite stable.  I have my
doubts, but I don't think it's the code in this situation.  The other errors
are Protocol failures on various nodes, and this one:

02/06/2003 11:35:24;0080;PBS_Server;Req;req_reject;Reject reply code=15018,
aux= 0, type=9, from root@vonkarman

and the last error produced before endless "log opened" lines is this:

02/06/2003 11:58:59;0100;PBS_Server;Req;;Type 57 request received from
pbs_mom@n 58.cfd, sock=11 02/06/2003 11:59:0

repeated about 5 times.  Those are the only strange log entries I can find,
but they don't seem to imply a whole lot.
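
In case it helps anyone sift through similar logs, here is a rough Python
sketch for splitting these records and tallying them by event code and
object type.  It assumes every record has the six semicolon-separated
fields seen in the entries above; the field names (timestamp, event,
server, obj_type, obj_id, message) are my own labels, not anything taken
from the PBS documentation.  The exit status helper assumes the usual
shell convention that a status above 128 means "killed by signal
(status - 128)", which may or may not be how PBS reports it here.

```python
import signal
from collections import Counter

# Field names are illustrative labels (an assumption), chosen to match the
# six semicolon-separated fields visible in the log excerpts above.
FIELDS = ("timestamp", "event", "server", "obj_type", "obj_id", "message")


def parse_pbs_log_line(line):
    """Split one log record into a dict, or return None if it does not fit.

    Wrapped continuation lines (no semicolons) are skipped by returning None.
    """
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def signal_from_exit_status(status):
    """Map an exit status to a signal name under the 128+N convention.

    This convention is an assumption here; returns None when it does not apply.
    """
    if status <= 128:
        return None
    try:
        return signal.Signals(status - 128).name
    except ValueError:
        return None


def tally(lines):
    """Count parsed records by (event code, object type)."""
    counts = Counter()
    for line in lines:
        rec = parse_pbs_log_line(line)
        if rec is not None:
            counts[(rec["event"], rec["obj_type"])] += 1
    return counts
```

Feeding the server log through tally() should make any unusual event
code stand out, and signal_from_exit_status(143) gives the signal name
if the 128+N assumption holds for the job above.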

-- 
Andy King            [EMAIL PROTECTED]
"A Turin Turambar turun' ambartanen"
                   "Guinness for strength"
