Re: [gridengine users] cgroups Integration in OGS/GE 2011.11 update 1

2012-05-25 Thread William Hay
On 24 May 2012 17:11, Rayson Ho ray...@scalablelogic.com wrote: But note that I am not against Univa - I believe Univa is improving Neither am I (except in jest). William

Re: [gridengine users] cgroups Integration in OGS/GE 2011.11 update 1

2012-05-25 Thread Mark Dixon
On Thu, 24 May 2012, Rayson Ho wrote: ... Not trying to launch any GPLv2 vs GPLv3 arguments here, but one of the reasons why Linux can't be switched to a newer version of the license is that Linus does not own the copyright. Or it might have been because he didn't use the optional GPLv2 phrase

Re: [gridengine users] cgroups Integration in OGS/GE 2011.11 update 1

2012-05-25 Thread Mark Dixon
On Thu, 24 May 2012, berg...@merctech.com wrote: ... If possible, it would be extremely helpful to express memory limits as a percentage of available resources, rather than just as a fixed quantity. Yes, I do the trick of extracting the swap and RAM sizes for each server and putting those
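The idea of expressing a memory limit as a percentage of a node's resources, based on the RAM and swap sizes extracted for each server, can be illustrated with a minimal sketch. The function name, the parameters and the example node sizes are assumptions made for illustration; they are not part of Grid Engine or of the cgroups integration discussed in this thread.

```python
GIB = 1024 ** 3


def limit_from_percentage(ram_bytes: int, swap_bytes: int, percent: float) -> int:
    """Return a byte limit equal to `percent` of RAM plus swap.

    Hypothetical helper: it only shows the arithmetic, not how a
    scheduler would store or enforce the resulting limit.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    total = ram_bytes + swap_bytes
    return int(total * percent / 100)


# Assumed example node: 32 GiB of RAM and 8 GiB of swap, with a job
# allowed to use 25% of the combined total.
limit = limit_from_percentage(32 * GIB, 8 * GIB, 25)
print(limit // GIB, "GiB")
```

A fixed quantity would have to be recalculated whenever a node's RAM changes, whereas a percentage scales with whatever sizes are reported for each host.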

[gridengine users] Reservations and parallel environments

2012-05-25 Thread Richard Ems
Hi list, I searched through the mailing list and tried using Google, but couldn't find an answer to my question. The (simplified) situation: We have a group of nodes with 12 slots. We submit parallel jobs there requesting 4 or 12 slots. We defined 2 PEs allocating 4 or 12 slots and associated

Re: [gridengine users] Reservations and parallel environments

2012-05-25 Thread William Hay
On 25 May 2012 10:10, Richard Ems richard@cape-horn-eng.com wrote: Hi list, I searched through the mailing list and tried using Google, but couldn't find an answer to my question. The (simplified) situation: We have a group of nodes with 12 slots. We submit parallel jobs there

Re: [gridengine users] Reservations and parallel environments

2012-05-25 Thread Richard Ems
On 05/25/2012 11:31 AM, Daniel Gruber wrote: The expected behavior would be that if there is never a host with 12 slots free, your cluster will be filled up with 4-slot jobs, even when they have lower priorities. The reservation you gave the 12-slot jobs will be attached at the end

Re: [gridengine users] Reservations and parallel environments

2012-05-25 Thread William Hay
On 25 May 2012 10:31, Daniel Gruber dgru...@univa.com wrote: The expected behavior would be that if there is never a host with 12 slots free, your cluster will be filled up with 4-slot jobs, even when they have lower priorities. The reservation you gave the 12-slot jobs will be

Re: [gridengine users] Reservations and parallel environments

2012-05-25 Thread Richard Ems
On 05/25/2012 11:51 AM, William Hay wrote: We have a group of nodes with 12 slots. We submit parallel jobs there requesting 4 or 12 slots. We defined 2 PEs allocating 4 or 12 slots and associated them with that group of nodes in one queue. What do you mean by a PE allocating 4 or 12 slots? You have 4

Re: [gridengine users] Reservations and parallel environments

2012-05-25 Thread Richard Ems
On 05/25/2012 12:04 PM, William Hay wrote: On 25 May 2012 10:31, Daniel Gruber dgru...@univa.com wrote: The expected behavior would be that if there is never a host with 12 slots free, your cluster will be filled up with 4-slot jobs, even when they have lower priorities. The

Re: [gridengine users] Reservations and parallel environments

2012-05-25 Thread Richard Ems
On 05/25/2012 12:50 PM, William Hay wrote: But thinking more on it ... perhaps because the 3 jobs J[123]_4 - started on a 12 slots node N1 - got started at different times, the last one started could have been started *after* another job J4_12 running on 12 slots on node N2. So job J3_4 will

Re: [gridengine users] Reservations and parallel environments

2012-05-25 Thread Daniel Gruber
On 25.05.2012 at 12:35, Richard Ems wrote: On 05/25/2012 12:27 PM, Daniel Gruber wrote: Exactly, it looks like the runtime estimate for your slot4 jobs is smaller than for your slot12 jobs. Backfilling must be active here. Did you submit both jobs in exactly the same way with s_rt? Try
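The interaction between runtime estimates and backfilling mentioned here can be sketched with the textbook backfilling test: a smaller job may start ahead of a reserved job only if it fits into the slots that are free now and its estimated end falls no later than the start of the reservation. This is a minimal illustration under that assumption, not Grid Engine's scheduler code; the function and parameter names are hypothetical.

```python
def can_backfill(now: float, est_runtime: float, slots_needed: int,
                 free_slots: int, reservation_start: float) -> bool:
    """Textbook backfill check (an assumption, not Grid Engine's logic).

    The job may start now if it fits into the currently free slots and
    its estimated end time does not run past the reservation's start.
    """
    fits_now = slots_needed <= free_slots
    ends_in_time = now + est_runtime <= reservation_start
    return fits_now and ends_in_time


# Assumed example: 4 slots are free now and a 12-slot reservation
# starts at t=3600. A 4-slot job estimated at 1800 s can be backfilled;
# the same job estimated at 7200 s cannot.
print(can_backfill(0, 1800, 4, 4, 3600))
print(can_backfill(0, 7200, 4, 4, 3600))
```

Under this model, a 4-slot job with a short runtime estimate keeps passing the check, which is consistent with the observation that the small jobs are started while the 12-slot jobs wait.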

Re: [gridengine users] cgroups Integration in OGS/GE 2011.11 update 1

2012-05-25 Thread Mark Dixon
On Thu, 24 May 2012, Rayson Ho wrote: ... 1) It's not a drop-in replacement. If upgrading gridengine on an existing system, activating your cgroup code will cause an immediate change in behaviour of jobs, without the user altering their submission flags. People don't tend to like that sort of

Re: [gridengine users] cgroups Integration in OGS/GE 2011.11 update 1

2012-05-25 Thread William Hay
On 25 May 2012 12:27, Mark Dixon m.c.di...@leeds.ac.uk wrote: It may not be defined by POSIX, but we're free to make up a pragmatic working definition :) It could be: Actual memory usage (typically RAM+swap), as measured by the available operating-system specific mechanism. Or OS

[gridengine users] Deleting PE referenced by wildcard

2012-05-25 Thread William Hay
Is there an easy way to delete a PE that is no longer referenced by a queue but is still matched by a PE wildcard? [root@admin03 tmp]# qconf -dp qlc-KLB denied: Pe qlc-KLB is still referenced in job 546941. Job 546941 requests qlc*, which refers both to the PE I want to delete and various
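Why a wildcard request keeps a PE "referenced" can be shown with a small sketch. It uses Python's fnmatch module as a stand-in for Grid Engine's own pattern matching, which is an assumption; the helper name and the second job in the example are hypothetical, while the PE name, the pattern and job 546941 come from the message above.

```python
from fnmatch import fnmatchcase


def jobs_blocking_pe(pe_name: str, job_requests: dict) -> list:
    """Return the IDs of jobs whose PE request pattern matches pe_name.

    job_requests maps a job ID to the PE name or wildcard it requested.
    fnmatchcase only approximates Grid Engine's matching rules.
    """
    return [job_id for job_id, pattern in job_requests.items()
            if fnmatchcase(pe_name, pattern)]


# Job 546941 requests "qlc*", as in the message; job 546942 and its
# request for "mpi" are made up for contrast.
job_requests = {546941: "qlc*", 546942: "mpi"}
print(jobs_blocking_pe("qlc-KLB", job_requests))
```

A check of this kind would flag the job even though no queue lists the PE any longer, because the pattern qlc* still matches the name qlc-KLB.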

Re: [gridengine users] cgroups Integration in OGS/GE 2011.11 update 1

2012-05-25 Thread Mark Dixon
On Fri, 25 May 2012, William Hay wrote: Out of curiosity, why do people still want the SSH integration code? In my case sshd is convenient to wrap if you want to do something to each task in a job before dropping privileges. Fair enough. Out of curiosity why do people still want gridengine

Re: [gridengine users] Deleting PE referenced by wildcard

2012-05-25 Thread William Hay
On 25 May 2012 14:00, Fritz Ferstl ffer...@univa.com wrote: Hi William, remove it from the queue's pe_list. Then wait for the job(s) to finish and you can delete it. I deleted the only queue that referenced it. There were no jobs in the PE or the queue that had referenced it. The job it

Re: [gridengine users] Deleting PE referenced by wildcard

2012-05-25 Thread William Hay
On 25 May 2012 14:45, Fritz Ferstl ffer...@univa.com wrote: Hhhhmmm ... would need to look into the code whether it even would complain about that for a queued job. It's clear for a running job and if that job was running before and got requeued then it would be easily explicable, I guess.

Re: [gridengine users] Deleting PE referenced by wildcard

2012-05-25 Thread Esztermann, Ansgar
On May 25, 2012, at 16:46 , William Hay wrote: I think I'll leave it for now as the PE doesn't cause any harm I was just looking to eliminate config cruft. We had the same problem about half a year ago; our cluster is quite heterogeneous, and it has a lot of isolated IB switches, so we rely

Re: [gridengine users] cgroups Integration in OGS/GE 2011.11 update 1

2012-05-25 Thread Rayson Ho
On Fri, May 25, 2012 at 8:30 AM, William Hay w@ucl.ac.uk wrote: Or OS dependent, on linux whatever cgroups allows us to control most easily and seems like a good thing to control. Maybe we can ask the Linux kernel guys to give us a better interface?? Out of curiosity, why do people

Re: [gridengine users] Gridengine and Hadoop

2012-05-25 Thread Ron Chen
Ralph: How commonly will we see jobs that request dynamic allocations? I have never seen Hadoop presentations talking about them at any BigData conferences. Just also want to mention that Moab is not open-source, and I don't think we will see much information about the integration from Moab.

Re: [gridengine users] Reservations and parallel environments

2012-05-25 Thread Richard Ems
Hi list again, something seems not to be working as I *expect* it to. Several jobs are waiting and are asking for a reservation, waiting for 12 slots to become free, and are at the top of the priority list. Several other jobs with lower priority and asking only for 4 slots get backfilled, but

[gridengine users] Requesting h_vmem or memfree

2012-05-25 Thread Prentice Bisbal
Okay, this is going to be a stupid question coming from someone who's been on this list for years, but here goes... I've just upgraded the RAM on a few cluster nodes to 32 GB (instead of 16 GB). A few users could benefit from this, so I'd like to be able to specify h_vmem or mem_free or s_vmem for

[gridengine users] max_reservations value

2012-05-25 Thread Richard Ems
Hi list, what happens if waiting jobs request more reservations than the value set in max_reservations? Do reservations on running jobs still count for the scheduler reservation plans? Is there any script to analyze the output of the schedule file? Thanks,

[gridengine users] Can I force a parallel job to be run on a single client?

2012-05-25 Thread John Young
If I wanted to be sure that a particular parallel job ran only on a single client node (as opposed to being spread across multiple nodes in our cluster), is there a good way to do that under gridengine? To be clear, I don't care which client is used, but I want the parallel job to run on a

Re: [gridengine users] max_reservations value

2012-05-25 Thread William Hay
On 25 May 2012 19:35, Richard Ems richard@cape-horn-eng.com wrote: Hi list, what happens if waiting jobs request more reservations than the value set in max_reservations? Do reservations on running jobs still count for the scheduler reservation plans? The

Re: [gridengine users] Tight SGE-SSH Integration

2012-05-25 Thread Rayson Ho
BTW, we dual-boot the RHEL kernel and the UEK (haven't tried UEK2 yet). The advantage of this setup is that we could always go back to the RHEL kernel with just a reboot. Also, we just need to set up OGS/GE (or any other software packages) once and can test it with both kernels. Rayson On Thu, May

Re: [gridengine users] Requesting h_vmem or memfree

2012-05-25 Thread Alex Chekholko
On 05/25/2012 11:34 AM, Prentice Bisbal wrote: Okay, this is going to be a stupid question coming from someone who's been on this list for years, but here goes... I've just upgraded the RAM on a few cluster nodes to 32 GB (instead of 16 GB). A few users could benefit from this, so I'd like to be

[gridengine users] about sub access list

2012-05-25 Thread mahbube rustaee
Hi all, In GE 6.2u5, I defined an ACL with sub-ACL entries as follows:
name octopus-users
type ACL
fshare 0
oticket 0
entries @octopus-master
---
name octopus-master
type ACL
fshare 0
oticket 0
entries sanki
--
The user_lists parameter of octopus.q is set to octopus-users. In this case