Re: [gridengine users] Rescheduling jobs leaving zombie process on compute node

2017-05-05 Thread Ben De Luca
ine, how do I add >> >> ENABLE_ADDGRP_KILL=TRUE >> >> >> To config via console? I would like to apply this to all the queues. >> >> Regards, >> Guillermo >> >> >> >> On 04/30/2017 10:49 AM, Reuti wrote: >> >> -BEGIN

Re: [gridengine users] Rescheduling jobs leaving zombie process on compute node

2017-04-29 Thread Ben De Luca
I see this with processes that spawn other processes: when you kill the parent, the children go free. On 29 April 2017 at 20:37, Reuti wrote: > Hi, > > On 28.04.2017 at 08:57, Guillermo Marco Puche wrote: > > > I'm expecting a weird behavior when I reschedule any job on my work > cluster. > >

Re: [gridengine users] Parallel jobs with flexible slot requests cause huge memory use

2014-03-10 Thread Ben De Luca
having there breaks. On Mon, Mar 10, 2014 at 5:45 PM, Ben De Luca wrote: > You hit this with reservation + parallel jobs. > > > On Mon, Mar 10, 2014 at 5:17 PM, Joshua Baker-LePain wrote: > >> On Sat, 8 Mar 2014 at 11:50am, Ben De Luca wrote >> >> >> Yes I h

Re: [gridengine users] Parallel jobs with flexible slot requests cause huge memory use

2014-03-10 Thread Ben De Luca
You hit this with reservation + parallel jobs. On Mon, Mar 10, 2014 at 5:17 PM, Joshua Baker-LePain wrote: > On Sat, 8 Mar 2014 at 11:50am, Ben De Luca wrote > > > Yes I have hit this, reservation needs to be off for all jobs. >> > > Ouch. So any jobs using reservati

Re: [gridengine users] Parallel jobs with flexible slot requests cause huge memory use

2014-03-08 Thread Ben De Luca
Yes, I have hit this; reservation needs to be off for all jobs. I found the section of the code allocating the memory and, as far as I can tell, commenting it out does nothing. If you look through the past emails on the list you will see me writing about it this time (almost exactly + 2 weeks) 2 years ag

Re: [gridengine users] some jobs need a pty

2013-11-14 Thread Ben De Luca
aah, I just saw it has no effect. I use that flag on our systems. On Fri, Nov 15, 2013 at 12:49 AM, Ben De Luca wrote: > Can you do this with the submit flags? -pty y[es]/n[o] > > > > > On Tue, Nov 12, 2013 at 5:46 PM, Reuti wrote: > >> Am 12.11.2013 um 17:

Re: [gridengine users] some jobs need a pty

2013-11-14 Thread Ben De Luca
Can you do this with the submit flags? -pty y[es]/n[o] On Tue, Nov 12, 2013 at 5:46 PM, Reuti wrote: > On 12.11.2013 at 17:15, Mechanic, Daniel wrote: > > > Ugh. > > > > You are correct, I spoke too soon. I should have said 'my planned > workaround' > > > > This workaround does NOT work. > >

[gridengine users] Copying config between versions

2013-02-07 Thread Ben De Luca
Does anyone have a sensible and automated method for copying configuration data between grid versions? ___ users mailing list users@gridengine.org https://gridengine.org/mailman/listinfo/users

Re: [gridengine users] Raising an old share tree bug...

2013-02-07 Thread Ben De Luca
Hi all, I just tried this same test on OGS/GE 2011.11p1 and it works perfectly. On Wed, Feb 6, 2013 at 6:28 PM, Ben De Luca wrote: > I have an 8.0.0c cluster in production and an 8.0.0e running for testing. > > No one has noticed it, though I have seen it before > Oh, I jus

Re: [gridengine users] Raising an old share tree bug...

2013-02-06 Thread Ben De Luca
. On Wed, Feb 6, 2013 at 2:05 PM, Orlando Richards wrote: > Hi Ben, > > > On 06/02/13 13:12, Ben De Luca wrote: > >> I'm fairly sure we are affected by this bug too, I am happy to help in >> the hunt and I have looked through the code more than once. >> >> > A

Re: [gridengine users] Raising an old share tree bug...

2013-02-06 Thread Ben De Luca
I'm fairly sure we are affected by this bug too. I am happy to help in the hunt, and I have looked through the code more than once. Which version of grid are you trying to fix? I haven't been following grid dev too closely; do we still have multiple forks? On Wed, Feb 6, 2013 at 12:07 PM, Mark Dixo

[gridengine users] Linux OOM killer oom_adj

2012-08-29 Thread Ben De Luca
I was wondering how people deal with OOM conditions on their clusters. We constantly have machines that die because the OOM killer takes out critical system services. Has anyone any experience with the oom_adj proc value, or a patch to grid to support it? /proc/[pid]/oom_adj (since Linux 2.6.11)
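The idea raised in this message can be sketched roughly as follows. This is an illustration only, not something taken from Grid Engine: the helper name `set_oom_adj` and its `proc_root` parameter are hypothetical. It relies on the documented behaviour of /proc/[pid]/oom_adj, which accepts values from -17 (never kill) to 15 (kill first); the file has been deprecated since Linux 2.6.36 in favour of /proc/[pid]/oom_score_adj, so a current system would probably use that instead.

```python
import os


def set_oom_adj(pid, value, proc_root="/proc"):
    """Write an OOM adjustment for a process (hypothetical helper).

    proc_root is an assumption added so the function can be exercised
    against a scratch directory instead of the real /proc.
    """
    # The kernel documents the valid oom_adj range as -17..15.
    if not -17 <= value <= 15:
        raise ValueError("oom_adj must be between -17 and 15")
    path = os.path.join(proc_root, str(pid), "oom_adj")
    with open(path, "w") as handle:
        handle.write(str(value))
    return path
```

A job wrapper could, for instance, call `set_oom_adj(os.getpid(), 15)` before starting the task so that the OOM killer prefers the job over system services. Lowering the value for a daemon generally needs elevated privileges.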

Re: [gridengine users] Security hole in most versions of Grid Engine

2012-04-20 Thread Ben De Luca
Ron and Dave. I appreciate the work that both of you have done. But can you both please stop. I would appreciate that too. Thanks -Ben

[gridengine users] Grid engine debug values.

2012-02-14 Thread Ben De Luca
Hi all, I am trying to debug an issue with our scheduler at the moment and I am aware of the values in the debug environment variable, SGE_DEBUG_LEVEL. At one point I knew what each of the values referred to, but that was long ago. I can't seem to find anything in the man pages

Re: [gridengine users] Signal sent to processes on requeue

2012-01-31 Thread Ben De Luca
http://gridscheduler.sourceforge.net/htmlman/htmlman5/queue_conf.html terminate_method looks like it might be simplest. On Tue, Jan 31, 2012 at 9:51 PM, Ben De Luca wrote: > Strangely, I'm pondering this issue at the moment. If a Python process > is killed any process started with subp

Re: [gridengine users] Signal sent to processes on requeue

2012-01-31 Thread Ben De Luca
Strangely, I'm pondering this issue at the moment. If a Python process is killed, any process started with subprocess does not die. The two methods I'm following: 1. A reaper: the grid job starts a Python process (parent) that starts two other jobs, the task and the reaper. A. The task is the proces
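The message is cut off before the reaper design is fully described, so the sketch below does not attempt to reproduce it. It shows a different, commonly used POSIX technique for the same problem: start the task in its own session and process group, then signal the whole group so that anything the task spawned is signalled too. The function names `run_in_group` and `kill_group` are made up for this illustration.

```python
import os
import signal
import subprocess
import sys


def run_in_group(argv):
    """Start argv as the leader of a new session and process group."""
    # start_new_session=True makes the child call setsid() before exec,
    # so the child's process group id equals its pid.
    return subprocess.Popen(argv, start_new_session=True)


def kill_group(proc, sig=signal.SIGTERM):
    """Signal every process in the child's group, then reap the child."""
    try:
        os.killpg(proc.pid, sig)
    except ProcessLookupError:
        # The whole group has already exited.
        pass
    proc.wait()
```

A wrapper would call `run_in_group([...])` for the task and `kill_group(proc)` from its own signal handler. Processes that deliberately leave the group (for example by calling setsid themselves) are not caught by this approach.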

Re: [gridengine users] Per-host licensing?

2012-01-15 Thread Ben De Luca
a whole machine but mantra I'd run more than one. > That's only going to become more common as we move to machines with > piles of cores in them.  Some things scale across the cores but some > don't and you're better off running two frames simultaneously. > >

Re: [gridengine users] Per-host licensing?

2012-01-15 Thread Ben De Luca
Sure :) Those are all relatively new products, or new licensing schemes. Is it generally sensible to run multiple instances of any of those packages (possibly excluding rvio)? On Sun, Jan 15, 2012 at 6:50 PM, Hugh Macdonald wrote: > On 15/01/12 17:39, Ben De Luca wrote: > > I think

Re: [gridengine users] Per-host licensing?

2012-01-15 Thread Ben De Luca
I think you will find that per host licensing is the exception rather than the rule.

[gridengine users] Is there an upstream?

2012-01-11 Thread Ben De Luca
I know there are a few forks now: sge, univa, ogs. But is there an upstream? Who do I contribute to? Or is everyone going their separate ways?

[gridengine users] queue slots per machine

2012-01-03 Thread Ben De Luca
Hi, I wonder if I am misremembering, but is there a way to configure a queue to have the same number of slots as there are NCOR (or even NCPU) per machine? I seem to remember doing this, though I may have set this with hostlist somehow. I am running SGE 8.0.0e (Son of Grid Engine)

Re: [gridengine users] Per-host licensing?

2011-12-29 Thread Ben De Luca
If people can say, what software are you running? On Thu, Dec 29, 2011 at 7:39 PM, Ciaran Wills wrote: > I think in my case I can just set up a queue with a number of instances equal > to the number of licenses I have. I'm not so much worried about maximum > efficiency as avoiding having job

Re: [gridengine users] troubles with xml output on GE 6.2u5

2011-12-29 Thread Ben De Luca
Here's a code snippet I wrote yesterday to handle the parsing of XML from the latest Son of GE. from xml.etree import cElementTree as ElementTree cmd = "qstat -ext -xml -u \*" proc = subprocess.Popen(cmd, stdout = subprocess.PIPE, std
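The snippet above is truncated, so what follows is not the author's code but a minimal sketch of the same idea under stated assumptions. It assumes the qstat XML contains `job_list` elements whose children (such as `JB_job_number` and `JB_name`) hold the job fields; the `SAMPLE` document is invented for illustration and real output may differ between Grid Engine versions. The function name `parse_jobs` is hypothetical. It uses `xml.etree.ElementTree` because `cElementTree` was removed in Python 3.9.

```python
import xml.etree.ElementTree as ElementTree

# Invented sample standing in for the output of `qstat -ext -xml -u \*`.
SAMPLE = """<job_info>
  <queue_info>
    <job_list state="running">
      <JB_job_number>101</JB_job_number>
      <JB_name>render</JB_name>
      <state>r</state>
    </job_list>
  </queue_info>
  <job_info>
    <job_list state="pending">
      <JB_job_number>102</JB_job_number>
      <JB_name>comp</JB_name>
      <state>qw</state>
    </job_list>
  </job_info>
</job_info>"""


def parse_jobs(xml_text):
    """Return one dict per job_list element, in document order."""
    root = ElementTree.fromstring(xml_text)
    jobs = []
    for job in root.iter("job_list"):
        # Map each child tag to its text content.
        record = {child.tag: child.text for child in job}
        # Keep the element's own state attribute under a separate key.
        record["list_state"] = job.get("state")
        jobs.append(record)
    return jobs
```

In practice the text passed to `parse_jobs` would come from the standard output of the qstat command rather than from `SAMPLE`.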

[gridengine users] vfx/animation users of grid

2011-12-29 Thread Ben De Luca
Hi All, I know there are a lot of science types on here, but I occasionally see the old animation/vfx person. I wondered how many of you out there might be on that side of things?

Re: [gridengine users] Courtesy binaries including Berkeley?

2011-10-25 Thread Ben De Luca
Hi David, What platform are you building for?

[gridengine users] Slotwise preemption and license usage.

2011-09-27 Thread Ben De Luca
I have this scenario. I have: a low-priority job A that requires a consumable resource called license; a high-priority job B that requires the same consumable resource called license. A has a checkpoint configuration, where it will checkpoint on job suspend. On each computer I have two que

Re: [gridengine users] Looking for Pre-Sales Engineer for Grid Engine.

2011-08-16 Thread Ben De Luca
I appreciate the bug fixes :)