Re: [gridengine users] "cannot run until clean up of an previous run has finished"

2016-01-25 Thread Marlies Hankel
Hi all, Thank you. So I really have to remove the node from SGE and then put it back in? Or is there an easier way? I checked the spool directory and there is nothing in there. Also, funnily enough, the node does accept jobs if they are single-node jobs. For example, there are two nodes free, n
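A minimal sketch of the spool check discussed in this thread, assuming a classic spool layout under $SGE_ROOT/$SGE_CELL/spool/<hostname>; the node name and paths are illustrative and vary per installation:

$ ls $SGE_ROOT/$SGE_CELL/spool/cpu-1-3/active_jobs   # leftover per-job directories would appear here
$ ls $SGE_ROOT/$SGE_CELL/spool/cpu-1-3/jobs          # spooled job scripts
$ qconf -ke cpu-1-3                                  # ask the qmaster to shut down the execd on that node
$ # then restart the execd on the node itself via the installation's sgeexecd startup script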

Re: [gridengine users] RoundRobin scheduling among users

2016-01-25 Thread Skylar Thompson
On Mon, Jan 25, 2016 at 10:17:16PM +0100, Reuti wrote: > > On 25.01.2016 at 20:34, Skylar Thompson wrote: > > > Yep, we use functional tickets to accomplish this exact goal. Every user > gets 1000 functional tickets via auto_user_fshare in sge_conf(5), though > your exact number will depend

Re: [gridengine users] RoundRobin scheduling among users

2016-01-25 Thread Reuti
On 25.01.2016 at 20:34, Skylar Thompson wrote: > Yep, we use functional tickets to accomplish this exact goal. Every user > gets 1000 functional tickets via auto_user_fshare in sge_conf(5), though > your exact number will depend on the number of tickets and weights you have > elsewhere in your poli
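On the scheduler side, a hedged sketch of where the functional-ticket weighting lives (parameter name from sched_conf(5); the value shown is only illustrative, and functional tickets have no effect while this weight is 0):

$ qconf -msconf
  ...
  weight_tickets_functional   10000
  ...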

Re: [gridengine users] nodes group

2016-01-25 Thread Reuti
On the one hand you could specify a list of nodes and use a wildcard for the queue: $ qsub -q "*@node01,*@node02" ... A first simplification would be to define a host group for these particular nodes (it's *not* necessary to use this hostgroup in the queue definition too). $ qsub -q "*@@i5" ..
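A short sketch of defining such a host group, assuming the group is called @i5 as in the example above; the node names are illustrative:

$ qconf -ahgrp @i5
  group_name @i5
  hostlist   node01 node02
$ qsub -q "*@@i5" ...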

Re: [gridengine users] RoundRobin scheduling among users

2016-01-25 Thread Skylar Thompson
Yep, we use functional tickets to accomplish this exact goal. Every user gets 1000 functional tickets via auto_user_fshare in sge_conf(5), though your exact number will depend on the number of tickets and weights you have elsewhere in your policy configuration. On Mon, Jan 25, 2016 at 11:25:53AM -080
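For reference, a hedged sketch of the global configuration side of this setup (parameter names from sge_conf(5); 1000 is the value mentioned above, and enforce_user auto is what makes user objects get created automatically with that share):

$ qconf -mconf
  ...
  enforce_user        auto
  auto_user_fshare    1000
  ...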

[gridengine users] RoundRobin scheduling among users

2016-01-25 Thread Christopher Heiny
Hi all, We've been using GridEngine for several years now, currently OGS 2011.11p1 on Fedora 20 installed from Fedora RPMs. Our job mix is mostly embarrassingly parallel - we use array jobs to dispatch up to 100 tasks, each of which might require 1, 16, 32, or 64 cores. Each job takes up a signi
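A minimal example of the kind of submission described here; the parallel environment name "smp", the task range and the script name are assumptions, not taken from the original post:

$ qsub -t 1-100 -pe smp 16 run_task.sh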

Re: [gridengine users] nodes group

2016-01-25 Thread Reuti
Hi, On 25.01.2016 at 19:49, Dimar Jaime González Soto wrote: > Hi everyone, I need to execute a program on a few nodes (in my case slave > nodes), how can I specify that through the command line? By "a few nodes" do you mean a parallel job, or that a serial job may only run on some nodes due

[gridengine users] nodes group

2016-01-25 Thread Dimar Jaime González Soto
Hi everyone, I need to execute a program on a few nodes (in my case slave nodes), how can I specify that through the command line? -- Regards, Dimar González Soto, Ingeniero Civil en Informática, Universidad Austral de Chile

Re: [gridengine users] "cannot run until clean up of an previous run has finished"

2016-01-25 Thread Reuti
Hi, > On 25.01.2016 at 00:46, Marlies Hankel wrote: > > Hi all, > > Over the weekend something seems to have gone wrong with one of the nodes in > our cluster. We get the error: > > cannot run on host "cpu-1-3.local" until clean up of an previous run has > finished > > > I have restarted
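A hedged sketch of how the situation can be inspected from the submit host; the job id is a placeholder and the host name is the one from the thread (qstat -j only shows a "scheduling info" section if schedd_job_info is enabled in sched_conf(5)):

$ qstat -j <jobid>                  # scheduling info: why a pending job is being held back
$ qstat -f -q "*@cpu-1-3.local"     # state of the queue instances on the affected host
$ qhost -h cpu-1-3.local -q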

Re: [gridengine users] Weird prolog behavior

2016-01-25 Thread Taras Shapovalov
Thank you guys for the advice! Best regards, Taras On Mon, Jan 25, 2016 at 5:54 PM, William Hay wrote: > On Mon, Jan 25, 2016 at 01:05:50PM +0100, Fritz Ferstl wrote: > > One could also wrap the shepherd if what you wanted to do is check for > the > > working directory and potentially create

Re: [gridengine users] Weird prolog behavior

2016-01-25 Thread William Hay
On Mon, Jan 25, 2016 at 01:05:50PM +0100, Fritz Ferstl wrote: > One could also wrap the shepherd if what you wanted to do is check for the > working directory and potentially create or mount it if it isn't there yet > before exec'ing the real shepherd. All so-called methods (prolog, starter, > pe-*

[gridengine users] Same job consumes memory in a different way

2016-01-25 Thread sudha.penmetsa
Hi, We have launched the same job at different times; h_vmem is defined as 12GB. One job consumed only 10.5G and was successful, while the other consumed 18.3G and was therefore killed. 01/13/2016 00:13:18|execd|test1|W|job 33452 exceeds job hard limit "h_vmem" of queue "test.q@test1
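For comparison, a hedged sketch of how such a limit is usually requested at submission time and how the actual peak usage can be checked afterwards; the script name is illustrative and the job id is the one from the log line above:

$ qsub -l h_vmem=12G myjob.sh
$ qacct -j 33452 | egrep 'maxvmem|exit_status|failed'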

Re: [gridengine users] Weird prolog behavior

2016-01-25 Thread Fritz Ferstl
One could also wrap the shepherd if what you wanted to do is check for the working directory and potentially create or mount it if it isn't there yet before exec'ing the real shepherd. All so-called methods (prolog, starter, pe-*, etc.) are run by the shepherd. So by wrapping it you can precede
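A minimal sketch of such a wrapper, assuming it is configured via shepherd_cmd in sge_conf(5) and that, like the real shepherd, it is started inside the job's spool directory where the "config" file holds a cwd= line; both assumptions should be verified against the actual installation:

#!/bin/sh
# hypothetical shepherd wrapper: create the job's working directory if it is
# missing, then hand control to the real sge_shepherd
WORKDIR=$(sed -n 's/^cwd=//p' config 2>/dev/null)
[ -n "$WORKDIR" ] && mkdir -p "$WORKDIR"
exec "$SGE_ROOT/bin/$($SGE_ROOT/util/arch)/sge_shepherd" "$@"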

Re: [gridengine users] Weird prolog behavior

2016-01-25 Thread William Hay
On Mon, Jan 25, 2016 at 01:33:09PM +0300, Taras Shapovalov wrote: >Hi guys, >We have faced uncharacteristic (for other workload managers) behavior >of OGS 2011.11p1 (probably UGE has the same behavior, not sure yet). >Prolog is always called after stderr/out files are created. T

[gridengine users] Weird prolog behavior

2016-01-25 Thread Taras Shapovalov
Hi guys, We have faced uncharacteristic (for other workload managers) behavior of OGS 2011.11p1 (probably UGE has the same behavior, not sure yet). Prolog is always called after stderr/out files are created. This means that if prolog creates some directories that did not exist before and std
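A small sketch of how this ordering can be observed, assuming SGE_STDOUT_PATH and JOB_ID are present in the prolog environment as they are for the job itself; the log path and script name are illustrative:

#!/bin/sh
# prolog.sh: record whether the job's stdout file already exists when prolog starts
{
  echo "$(date) job $JOB_ID prolog, stdout path: $SGE_STDOUT_PATH"
  [ -e "$SGE_STDOUT_PATH" ] && echo "  -> stdout file existed before prolog ran"
} >> /tmp/prolog_trace.log
exit 0   # a non-zero prolog exit status would put the queue/job into an error state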