Dear David,

I'm sorry to bother you again with this issue, but the problem still exists.
Please have a look at this example:

- I submitted a job like this:

      qsub -q fwd -l nodes=1:ppn=4 -I -l walltime=12:00:00

- maui.log tells me that the job cannot be started:

      03/21 14:30:11 MRMJobStart(784238,Msg,SC)
      03/21 14:30:11 MPBSJobStart(784238,base,Msg,SC)
      03/21 14:30:11 ERROR:    job '784238' cannot be started: (rc: 15046  errmsg: 'Resource temporarily unavailable MSG=job allocation request exceeds currently available cluster nodes, 1 requested, 0 available'  hostlist: 'fluid001:ppn=4')
      03/21 14:30:11 ERROR:    cannot start job '784238' in partition DEFAULT
      03/21 14:30:11 MJobPReserve(784238,DEFAULT,ResCount,ResCountRej)
      03/21 14:30:30 job '784238'  State: Idle  EState: Idle  QueueTime: Tue Mar 21 14:29:50

- checkjob knows that on this particular node there are 16 CPU cores and thinks that 9 are in use:

      checking node fluid001

      State:      Running  (in current state for 00:00:00)
      Expected State:   Idle   SyncDeadline: Sat Oct 24 14:26:40
      Configured Resources: PROCS: 16  MEM: 62G  SWAP: 62G  DISK: 1M
      Utilized   Resources: SWAP: 10G
      Dedicated  Resources: PROCS: 9
      Opsys:      ubuntu  Arch: x64
      Speed:      1.00    Load: 15.030
      Network:    [DEFAULT]
      Features:   [NONE]
      Attributes: [Batch]
      Classes:    [default 16:16][fwd 7:16][fwi 16:16][short 16:16][long 16:16][benchmark 16:16][fwo 16:16]

      Total Time: INFINITY  Up: INFINITY (98.92%)  Active: INFINITY (93.87%)

      Reservations:
        Job 'x772551'(x1)  -6:05:29:39 -> 2:02:30:21 (8:08:00:00)
        Job '772553'(x1)  -6:05:29:39 -> 2:02:30:21 (8:08:00:00)
        Job '772555'(x1)  -6:05:29:39 -> 2:02:30:21 (8:08:00:00)
        Job '772557'(x1)  -6:05:29:39 -> 2:02:30:21 (8:08:00:00)
        Job '779684'(x1)  -2:20:22:38 -> 5:11:37:22 (8:08:00:00)
        Job '779685'(x1)  -2:20:22:38 -> 5:11:37:22 (8:08:00:00)
        Job '781758'(x1)  -1:19:54:49 -> 6:12:05:11 (8:08:00:00)
        Job '783132'(x1)  -1:00:19:39 -> 7:07:40:21 (8:08:00:00)
        Job '783909'(x1)  -6:19:42 -> 8:01:40:18 (8:08:00:00)
        User 'fluid.0.0'(x1)  -00:03:52 -> INFINITY ( INFINITY)
      Blocked Resources@00:00:00   Procs: 7/16 (43.75%)
      Blocked Resources@2:02:30:21 Procs: 11/16 (68.75%)
      Blocked Resources@5:11:37:22 Procs: 13/16 (81.25%)
      Blocked Resources@6:12:05:11 Procs: 14/16 (87.50%)
      Blocked Resources@7:07:40:21 Procs: 15/16 (93.75%)
      Blocked Resources@8:01:40:18 Procs: 16/16 (100.00%)
      JobList:  772551,772553,772555,772557,779684,779685,781758,783132,783909

- with qstat I can see that there is only one free slot on the node and 15 are used by the jobs:

      qstat -ae -n | grep fluid001
         fluid001/0
         fluid001/9
         fluid001/11
         fluid001/13
         fluid001/5,7
         fluid001/14-15
         fluid001/1
         fluid001/2-4,6
         fluid001/8,10

- The node has 9 running jobs, but Maui still misparses the ranged allocation syntax: it apparently counts one processor per joblist entry (hence the 9 dedicated procs) instead of expanding the ranges to the 15 cores that are actually allocated.

Do I have to switch to a newer version of Torque? I am currently using version 5.1.1.

Thanks in advance,
Henrik

> On 22 Aug 2016, at 18:21, David Beer <db...@adaptivecomputing.com> wrote:
>
> This incompatibility exists for all versions of Torque > 5. It has been fixed
> in the Maui source, but no official release has been made. You can grab the
> new source from svn:
>
>     svn co svn://opensvn.adaptivecomputing.com/maui
>
> After that you can build it as you would a normal tarball.
>
> On Sat, Aug 20, 2016 at 3:59 AM, Guangping Zhang <zgp...@126.com> wrote:
>
> Dear all,
>
> I have found that Torque 6.0.2 does not always work properly with Maui 3.3.1.
> In the Maui log file I see:
>
>     08/20 17:14:12 INFO:     PBS node node04 set to state Idle (free)
>     08/20 17:14:12 INFO:     node 'node04' changed states from Running to Idle
>     08/20 17:14:12 MPBSNodeUpdate(node04,node04,Idle,NODE00)
>     08/20 17:14:12 INFO:     node node04 has joblist '0-9/248.node00'
>     08/20 17:14:12 ALERT:    cannot locate PBS job '0-9' (running on node node04)
>
> where '0-9' is not a job ID but the processors allocated to job 248.node00. So,
> will this prevent Torque from working correctly along with Maui?
>
> Thanks for your discussion.
>
> /Guangping
>
> _______________________________________________
> torqueusers mailing list
> torqueus...@supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
> --
> David Beer | Torque Architect
> Adaptive Computing
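As an illustration of the parsing mismatch discussed above, here is a minimal sketch (my own illustrative code, not Maui or Torque source; parse_slots is a hypothetical helper) of how the ranged allocation entries reported by Torque >= 5, such as 'fluid001/2-4,6' or '0-9/248.node00', expand into individual cores. A scheduler that treats each joblist entry as a single processor would count 9 procs for the 9 entries on fluid001, while the ranges actually cover 15 cores:

```python
# Illustrative sketch only -- not Maui/Torque code. Expands Torque >= 5
# ranged core allocations like "fluid001/2-4,6" into individual core indices.

def parse_slots(entry):
    """Expand 'host/2-4,6' into (host, [2, 3, 4, 6])."""
    host, spec = entry.split("/")
    cores = []
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cores.extend(range(int(lo), int(hi) + 1))  # inclusive range
        else:
            cores.append(int(part))
    return host, cores

if __name__ == "__main__":
    # The nine allocations qstat reported for fluid001:
    entries = ["fluid001/0", "fluid001/9", "fluid001/11", "fluid001/13",
               "fluid001/5,7", "fluid001/14-15", "fluid001/1",
               "fluid001/2-4,6", "fluid001/8,10"]
    used = sorted(c for e in entries for c in parse_slots(e)[1])
    # 9 entries, but 15 cores in use (only core 12 is free):
    print(len(entries), "entries,", len(used), "cores in use:", used)
```

Counting entries gives 9 (matching the "Dedicated Resources: PROCS: 9" that checkjob shows), while expanding the ranges gives 15 cores in use, matching what qstat reports.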
_______________________________________________
mauiusers mailing list
mauiusers@supercluster.org
http://www.supercluster.org/mailman/listinfo/mauiusers