Michel,
  
  Maui 3.0.7 used an improved technique for tracking job submission 
hosts. This method, while addressing most problems encountered in Maui 3.0.6, 
was unable to address all possible situations.
  Maui 3.2.6 and higher fully support job submissions from multiple 
hosts and address all known situations.

Thanks,

Trent Weber
Cluster Resources, Inc.


On Tue, 1 Jul 2003, Michel Beheregaray wrote:

> Hello,
>  I installed an OSCAR cluster, version 2.2.1 (with OpenPBS 2.3.16 and
> Maui 3.0.7p8).
> 
> ifrcir123.univ-pau.fr is the OSCAR and PBS server.
> 
> I configured a route queue on another PBS server (ifrcir101.univ-pau.fr,
> outside the OSCAR cluster):
> 
> create queue rcourt
> set queue rcourt queue_type = Route
> set queue rcourt route_destinations = [EMAIL PROTECTED]
> set queue rcourt enabled = True
> set queue rcourt started = True
> 
> After starting pbs_server and maui on ifrcir123.univ-pau.fr, I submit
> a job from ifrcir101 to the rcourt queue.
> The job starts running on ifrcir123 (or one of its nodes). All is well.
> The behaviour is the same if I submit other jobs from the same host (even as
> different users).
> 
> Then I submit another job (or the same script) from ifrcir123 (so locally) to
> the court queue.
> The job stays queued ("qstat -a" shows state Q), and here is the maui.log
> extract:
> 
> 06/30 10:10:19 JobStart(214)
> 06/30 10:10:19 JobDistributeTasks(214,0,NodeList,TaskMap)
> 06/30 10:10:19 AMReserveJobAllocation(214,Reason,ErrMsg)
> 06/30 10:10:19 RMStartJob(214,SC)
> 06/30 10:10:19 PBSStartJob(214,0)
> 06/30 10:10:19 ERROR:    cannot set job '214' attr 'Resource_List:neednodes' to
> 'iplmat133.univ-pau.fr' (rc: 15001 'Unknown Job Id')
> 06/30 10:10:19 ERROR:    cannot set hostlist for job '214')
> 06/30 10:10:19 WARNING:  cannot start job '214' through PBS
> 06/30 10:10:19 WARNING:  cannot start job '214' through resource manager
> 06/30 10:10:19 ALERT:    job '214' deferred after 1 failed start attempts (API
> failure on last attempt)
> 06/30 10:10:19 JobDefer(214,1:00:00,RMFailure,job could not be started through RM)
> 06/30 10:10:19 ALERT:    job '214' cannot run (deferring job for 3600 seconds)
> [001]              214   1:  1:  1(1) ALL    2:00:00(????????) campillo      lcs
>       Idle DEFAULT  [court 1] 1056960619   [NONE] [NONE] [NONE] >=      0 >=   
>   0 [NONE]
> 06/30 10:10:19 AMCancelAllocationReservation([NONE],214,Reason)
> 06/30 10:10:19 ERROR:    cannot run job '214' in partition DEFAULT
> 06/30 10:10:19 ReservePriorityJob(214,DEFAULT,ResCount)
> 06/30 10:10:19 JobReserve(214,Priority)
> 06/30 10:10:19 INFO:     16 feasible tasks found for job 214:0 (1 Needed)
> 06/30 10:10:19 INFO:     16 feasible tasks found for job 214:0 (1 Needed)
> 06/30 10:10:19 INFO:     located resources for 1 tasks (16) in best partition 
> for job 214 at time 0:00:00
> 06/30 10:10:19 INFO:     tasks located for job 214:  1 of 1 required (16 feasible)
> 06/30 10:10:19 JobDistributeTasks(214,0,NodeList,TaskMap)
> 06/30 10:10:19 ReservationJCreate(214,MNodeList,0:00:00,Priority,Res)
> 06/30 10:10:19 INFO:     job '214' reserved 1 tasks (partition DEFAULT) to start
> in 0:00:00 on Mon Jun 30 10:10:19
> 
> Then I kill the jobs and restart the PBS server and Maui on ifrcir123.
> I do the same as before, but this time I begin by submitting a job from
> ifrcir123 (locally); the job starts running and all is well. Then I submit a
> second job from ifrcir101 and the result is the same: the job stays queued.
> So I cannot submit jobs from different hosts to the same PBS server.
> 
> I tried with other PBS servers on the network and the result is the same.
> Only the server that submits the first job is accepted for new jobs.
> The others are rejected.
> 
> But this works fine if the PBS server that receives the jobs is a
> 2.3.12 PBS server with Maui 3.0.6p3.
> 
> Is there a Maui parameter to configure so that it accepts two or more
> submitting servers at a time?
> Is there a problem (and a patch?) with this version of Maui?
> What did I do wrong?
> 
> Thank you for your help,
> 
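For anyone trying to confirm that they are hitting the same failure, the tell-tale line in the maui.log extract quoted above is the "cannot set job ... attr ... (rc: 15001 'Unknown Job Id')" error. The following is only a rough sketch of how such lines might be pulled out of a log. The names find_start_failures, FAILURE_RE and SAMPLE are made up for illustration, and the line format is assumed from the quoted extract alone, so other Maui versions may log this differently.

```python
import re

# Pattern assumed from the single error line in the quoted maui.log extract.
# \s+ after "to" tolerates the line being wrapped before the quoted value.
FAILURE_RE = re.compile(
    r"ERROR:\s+cannot set job '(?P<job>\d+)' attr '(?P<attr>[^']+)' to\s+"
    r"'(?P<value>[^']+)' \(rc: (?P<rc>\d+) '(?P<msg>[^']+)'\)"
)


def find_start_failures(log_text):
    """Return one dict per 'cannot set job ... attr' error found in log_text."""
    failures = []
    for match in FAILURE_RE.finditer(log_text):
        failures.append(
            {
                "job": match.group("job"),
                "attr": match.group("attr"),
                "value": match.group("value"),
                "rc": int(match.group("rc")),
                "msg": match.group("msg"),
            }
        )
    return failures


# Sample lines copied from the quoted extract (without the "> " quote prefix).
SAMPLE = """06/30 10:10:19 PBSStartJob(214,0)
06/30 10:10:19 ERROR:    cannot set job '214' attr 'Resource_List:neednodes' to
'iplmat133.univ-pau.fr' (rc: 15001 'Unknown Job Id')
06/30 10:10:19 ERROR:    cannot set hostlist for job '214')
06/30 10:10:19 WARNING:  cannot start job '214' through PBS
"""

if __name__ == "__main__":
    for failure in find_start_failures(SAMPLE):
        print(failure["job"], failure["rc"], failure["msg"])
```

If the PBS return code is 15001 for jobs submitted from a second host but not for jobs from the first one, that matches the symptom described in this thread.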



_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users
