I'm not sure if this is a maui or a torque issue, so I'm being slightly rude 
and sending this to both lists.

We're running maui 3.3-4 and torque 2.5.7-9 on CentOS 6.3.

Most of the time there's no problem: we qsub a set of submit files, they run, 
and we get our output. Every now and then, though, the jobs are submitted and 
then get stuck in the queued (Q) state, so that qstat shows something like this...

[eob_merge@srvBatchHead01 ~]$ qstat
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
48267.srvbatchhead01       FS93130003_000   eob_merge              0 Q batch
48268.srvbatchhead01       FS93130003_001   eob_merge              0 Q batch
48269.srvbatchhead01       FS93130003_002   eob_merge              0 Q batch
48270.srvbatchhead01       FS93130003_003   eob_merge              0 Q batch
48271.srvbatchhead01       FS93130003_004   eob_merge              0 Q batch
48272.srvbatchhead01       FS93130006_000   eob_merge              0 Q batch
48273.srvbatchhead01       FS93130006_001   eob_merge              0 Q batch
48274.srvbatchhead01       FS93130006_002   eob_merge              0 Q batch
48275.srvbatchhead01       FS93130006_003   eob_merge              0 Q batch
48276.srvbatchhead01       FS93130006_004   eob_merge              0 Q batch

and they'll sit there forever like this.  Restarting all of the associated 
services (maui, pbs_server, pbs_mom and munge) doesn't help.  In the end we 
just reboot all of the boxes in the cluster (fortunately, it's a small 
number), and everything comes back up and runs.
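
In case it helps anyone compare notes, here is a minimal sketch of how the 
qstat listing above could be checked from a script. It only parses the default 
qstat column layout shown in this message. The function names parse_qstat and 
stuck_in_queue are made up for illustration, and treating "state Q with a 
Time Use of 0" as stuck is an assumption, since this output does not say how 
long a job has been waiting.

```python
# Two rows copied from the listing above, used as sample input.
SAMPLE = """\
Job id                    Name             User            Time Use S Queue
------------------------- ---------------- --------------- -------- - -----
48267.srvbatchhead01       FS93130003_000   eob_merge              0 Q batch
48268.srvbatchhead01       FS93130003_001   eob_merge              0 Q batch
"""


def parse_qstat(text):
    """Parse default `qstat` output into a list of job dictionaries."""
    jobs = []
    for line in text.splitlines():
        fields = line.split()
        # Job rows have six columns and an id of the form <number>.<server>;
        # this skips the prompt, the header and the dashed separator.
        if len(fields) != 6 or not fields[0].split(".")[0].isdigit():
            continue
        job_id, name, user, time_use, state, queue = fields
        jobs.append({
            "id": job_id,
            "name": name,
            "user": user,
            "time_use": time_use,
            "state": state,
            "queue": queue,
        })
    return jobs


def stuck_in_queue(jobs):
    """Return jobs that are queued and have used no time (assumed heuristic)."""
    return [j for j in jobs if j["state"] == "Q" and j["time_use"] == "0"]


if __name__ == "__main__":
    for job in stuck_in_queue(parse_qstat(SAMPLE)):
        print(job["id"], job["name"])
```

Feeding it the real output of qstat at intervals would at least show when the 
jobs stop moving out of the Q state.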

I've proposed a weekly reboot of everything, but have been told that this can 
only be a stopgap measure and cannot be the permanent fix.

Does anyone have any clues?

Kind regards,

Jack Wilkinson
Services | VPay(r)
P: 972.367-6622
jwilkin...@stoneeagle.com
www.stoneeagle.com
www.vpayusa.com

111 W. Spring Valley Rd., #100
Richardson, TX 75081
