I suspect that the scripts OSCAR runs detect the physical CPU count but do not detect multiple cores as separate "processors". If that is the case, it is a fairly major issue and needs to be addressed soon, since multi-core processors are becoming more and more common. I don't have any hardware to test this on myself yet, but I should in a couple of months.
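If that hunch is right, the detection is probably counting physical packages rather than logical processors. As a quick check on any node (a sketch, assuming a standard Linux /proc/cpuinfo), the two counts can be compared directly:

    # logical processors the kernel sees (what the scheduler should use)
    grep -c '^processor' /proc/cpuinfo

    # distinct physical packages (what a core-unaware script may count)
    grep '^physical id' /proc/cpuinfo | sort -u | wc -l

On the two machines described below, the first number should come out as 8 and 4, the second as 4 and 2.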
On the plus side, it ought to be a fairly easy fix...

On 22 Jan 2007 20:37:59 -0500, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Thanks for the message!
>
> Actually, I only have two client nodes. One has 4 dual-core AMD CPUs
> (8 logical CPUs in total). The other host has 2 dual-core CPUs (4
> logical CPUs in total).
>
> I use "qsub ./script.sh" to submit jobs. All the jobs use the same
> script.
>
> If I submit 12 jobs, only 6 jobs run at the same time (4 on the first
> node and 2 on the other node).
>
> Here is what I get if I do "print server" from qmgr. Do I need to
> change anything?
>
> ======================================================
>
> #
> # Create queues and set their attributes.
> #
> #
> # Create and define queue workq
> #
> create queue workq
> set queue workq queue_type = Execution
> set queue workq resources_max.cput = 10000:00:00
> set queue workq resources_max.ncpus = 12
> set queue workq resources_max.nodect = 2
> set queue workq resources_max.walltime = 10000:00:00
> set queue workq resources_min.cput = 00:00:01
> set queue workq resources_min.ncpus = 1
> set queue workq resources_min.nodect = 1
> set queue workq resources_min.walltime = 00:00:01
> set queue workq resources_default.cput = 10000:00:00
> set queue workq resources_default.ncpus = 1
> set queue workq resources_default.nodect = 1
> set queue workq resources_default.walltime = 10000:00:00
> set queue workq resources_available.nodect = 2
> set queue workq enabled = True
> set queue workq started = True
> #
> # Set server attributes.
> #
> set server scheduling = True
> set server default_queue = workq
> set server log_events = 64
> set server mail_from = adm
> set server query_other_jobs = True
> set server resources_available.ncpus = 12
> set server resources_available.nodect = 2
> set server resources_available.nodes = 2
> set server resources_max.ncpus = 12
> set server resources_max.nodes = 2
> set server scheduler_iteration = 60
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server pbs_version = 2.0.0p8
>
> ===========================================================
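Worth checking alongside that qmgr output: Torque takes each node's processor count from the nodes file on the pbs_server host (typically /var/spool/pbs/server_priv/nodes, though the path varies by install). If the cores were detected correctly, the two nodes described above should have entries along these lines (hostnames invented for the sketch):

    # server_priv/nodes -- one entry per compute node
    node1 np=8    # 4 dual-core CPUs = 8 logical processors
    node2 np=4    # 2 dual-core CPUs = 4 logical processors

If the file instead says np=4 and np=2, that would account for exactly the six simultaneous jobs being seen (4 + 2).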
> Here is the maui.cfg file:
>
> ===========================================================
>
> # maui.cfg 3.2.6p14
>
> SERVERHOST photon.bwh.harvard.edu
> # primary admin must be first in list
> ADMIN1 root
>
> # Resource Manager Definition
>
> RMCFG[DUAL.EFOCHT.DE] TYPE=PBS
>
> # Allocation Manager Definition
>
> AMCFG[bank] TYPE=NONE
>
> # full parameter docs at http://clusterresources.com/mauidocs/a.fparameters.html
> # use the 'schedctl -l' command to display current configuration
>
> RMPOLLINTERVAL 00:00:10
>
> SERVERPORT 42559
> SERVERMODE NORMAL
>
> # Admin: http://clusterresources.com/mauidocs/a.esecurity.html
>
> LOGFILE maui.log
> LOGFILEMAXSIZE 10000000
> LOGLEVEL 3
>
> # Job Priority: http://clusterresources.com/mauidocs/5.1jobprioritization.html
>
> QUEUETIMEWEIGHT 1
>
> # FairShare: http://clusterresources.com/mauidocs/6.3fairshare.html
>
> #FSPOLICY PSDEDICATED
> #FSDEPTH 7
> #FSINTERVAL 86400
> #FSDECAY 0.80
>
> # Throttling Policies: http://clusterresources.com/mauidocs/6.2throttlingpolicies.html
>
> # NONE SPECIFIED
>
> # Backfill: http://clusterresources.com/mauidocs/8.2backfill.html
>
> BACKFILLPOLICY ON
> RESERVATIONPOLICY CURRENTHIGHEST
>
> # Node Allocation: http://clusterresources.com/mauidocs/5.2nodeallocation.html
>
> NODEALLOCATIONPOLICY MINRESOURCE
>
> # QOS: http://clusterresources.com/mauidocs/7.3qos.html
>
> # QOSCFG[hi] PRIORITY=100 XFTARGET=100 FLAGS=PREEMPTOR:IGNMAXJOB
> # QOSCFG[low] PRIORITY=-1000 FLAGS=PREEMPTEE
>
> # Standing Reservations: http://clusterresources.com/mauidocs/7.1.3standingreservations.html
>
> # SRSTARTTIME[test] 8:00:00
> # SRENDTIME[test] 17:00:00
> # SRDAYS[test] MON TUE WED THU FRI
> # SRTASKCOUNT[test] 20
> # SRMAXTIME[test] 0:30:00
>
> # Creds: http://clusterresources.com/mauidocs/6.1fairnessoverview.html
>
> # USERCFG[DEFAULT] FSTARGET=25.0
> # USERCFG[john] PRIORITY=100 FSTARGET=10.0-
> # GROUPCFG[staff] PRIORITY=1000 QLIST=hi:low QDEF=hi
> # CLASSCFG[batch] FLAGS=PREEMPTEE
> # CLASSCFG[interactive] FLAGS=PREEMPTOR
>
> NODEACCESSPOLICY
>
> =====================================================================
>
> > Check ganglia at http://localhost/ganglia and see where those 6 jobs
> > are. In particular, make sure they are not all sitting on one node or
> > something silly. If you have 6 nodes and they are running one per
> > node, then the queue is probably set up to reserve an entire node for
> > each process. There is a flag in the torque config file (I think)
> > that tells it to do this.
> >
> > Could you post the script you are queueing with and the qsub command
> > you use to submit the job?
> >
> > You are running an smp kernel on your head node, I assume? If you
> > happened to be running the non-smp version when you installed
> > torque/maui, they probably don't know that there is more than one
> > processor available...
> >
> > Hopefully this gives you some thoughts as to where to start looking...
> >
> > On 22 Jan 2007 13:13:02 -0500, Jinsong Ouyang <[EMAIL PROTECTED]> wrote:
> >> I am using OSCAR 5.0 & Fedora 5.0 x86_64. I have a total of 12
> >> logical CPUs on my computing nodes. I use qsub to submit jobs and
> >> can only have a maximum of 6 jobs running simultaneously. Half of
> >> the CPUs are not used. Could anyone please tell me how to increase
> >> the number of running jobs? I tried to set max_running using qmgr.
> >> It does not seem to change anything. Do I need to change anything
> >> in maui.cfg?
> >>
> >> Many thanks,
> >>
> >> JO
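One detail that stands out in the maui.cfg quoted above: the NODEACCESSPOLICY line at the bottom has no value. If memory serves, Maui expects a value such as SHARED, SINGLEJOB, SINGLETASK, or SINGLEUSER there, and packing several single-processor jobs onto one multi-core node needs the shared setting (a one-line sketch, not a full config):

    # maui.cfg -- let multiple jobs share the processors of one node
    NODEACCESSPOLICY SHARED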
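The submission script itself never made it into the thread; for reference, a minimal single-processor script of the kind "qsub ./script.sh" implies might look like this (entirely hypothetical -- the program name and limits are placeholders):

    #!/bin/sh
    #PBS -l nodes=1:ppn=1        # request one processor on one node
    #PBS -l walltime=01:00:00    # placeholder time limit
    cd $PBS_O_WORKDIR            # run from the directory qsub was called in
    ./my_program                 # placeholder for the real work

With twelve of these queued, a correctly configured 8+4-processor cluster should be able to run all twelve at once.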
