[gridengine users] Simplifying Parallel Environments

2012-02-01 Thread Brian Smith
ady had some success with beta testers, greatly simplifying their submit scripts (for more complex cases) and shortening our documentation :) The project page is at: https://github.com/brichsmith/gepetools Suggestions for added features would be appreciated! Brian Smith Senior Systems Admin

Re: [gridengine users] Simplifying Parallel Environments

2012-02-02 Thread Brian Smith
On 02/02/2012 11:52 AM, Mark Dixon wrote: On Wed, 1 Feb 2012, Brian Smith wrote: I've started a github page for some tools I've put together from various bits of code, how-tos, etc. to simplify the setup of parallel environments so that they work universally for all MPI implement

[gridengine users] pe_list format

2012-07-11 Thread Brian Smith
. Would it work? -Brian Brian Smith Sr. System Administrator Research Computing, University of South Florida 4202 E. Fowler Ave. SVC4010 Office Phone: +1 813 974-1467 Organization URL: http://rc.usf.edu

Re: [gridengine users] pe_list format

2012-07-11 Thread Brian Smith
est file to pull off the desired behavior, and keep only a small set of PEs available. -Brian On 07/11/2012 04:28 P

[gridengine users] schedule file mysteries

2012-07-24 Thread Brian Smith
this file represents the decision tree that is generated every schedule interval, so I assume every job should be mentioned. Is this not the case? Any insight is appreciated. -Brian

[gridengine users] Handling Time Slot Differentiation

2012-08-16 Thread Brian Smith
What say other GridEngine gurus about this approach? I believe this will help with my resource reservation woes and at the very least, should make my scheduler iterations much shorter. Is there a better way? Are there any potential pitfalls I may have missed? Any

Re: [gridengine users] Handling Time Slot Differentiation

2012-08-16 Thread Brian Smith
nyone (outside of Sun/Oracle/Univa-internal) spent time optimizing the queue configuration with some emphasis on schedule iteration performance? Seems like a neat area. Thanks, -Brian

Re: [gridengine users] Handling Time Slot Differentiation

2012-08-18 Thread Brian Smith
ce improvement. Thanks to everyone who provided input on this. I'll post the results of my resource reservation tests also. -Brian On Fri, Aug 17, 2012 at 3:33 PM, Stuart Barkley wrote: > On Thu, 16 Aug 2012 at 12:07 -, Brian Smith wrote: > > > { > >name

Re: [gridengine users] PE Job Starvation and Job Reservation

2012-08-18 Thread Brian Smith
y >>> >>> However, the job arrays under job #2427 keep PE job #2427 from running. >>> >>> What other settings do I need to check for to fix this? >>> >>> I am using binary GE2011.11 >>> >>> Thanks, >>> Joseph

Re: [gridengine users] Handling Time Slot Differentiation

2012-08-18 Thread Brian Smith
tatus tag. I'm going to have one of my guys implement that here. That should help eke out the extra stability we're looking for. Thanks, -Brian On Fri, Aug 17, 2012 at 4:28 AM, William Hay wrote: > On 16 August 2012 17:07, Brian Smith wrote: > > > > I want to ditch t

[gridengine users] Verifying behavior of max_reservations

2012-08-24 Thread Brian Smith
): 3% > 128 cores, 7% 64-128 cores, 15% 32-64 cores, 20% 16-32 cores, 25% 8-16 cores, 18% 2-8 cores, 12% 1 core. I can crank the value up to 512 and my scheduler intervals are still small. I have ~450 nodes and ~4500 CPUs. -Brian

Re: [gridengine users] Verifying behavior of max_reservations

2012-08-29 Thread Brian Smith
ntage points of utilization :) Thanks for the link to the design documents. They'll be very helpful. -Brian

Re: [gridengine users] Linux OOM killer oom_adj

2012-08-29 Thread Brian Smith
you should be looking to schedule your memory usage accordingly. The OOM killer shouldn't be a factor if memory is handled as a scheduler consideration. -Brian

Re: [gridengine users] Linux OOM killer oom_adj

2012-08-29 Thread Brian Smith
tasks (qrsh), it generally eats up more VMEM than the slave tasks do. It was just hard to get right. -Brian On 08/29

Re: [gridengine users] Son of Grid Engine 8.1.2 available

2012-08-30 Thread Brian Smith
I'm using mem_free without issue w/ 8.1.1. What is the output of "qconf -se global"? Is mem_free in the report_variables parameter? -Brian On 08/28/2012 07:37 PM, Joseph Farran wrote: Hi Reuti. Here it is with the additional info: $ qrsh -w v -q bio -l mem_free=190G Job 1637 (-l h_rt=604800,m
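The check asked for here (is mem_free listed in report_variables for the global host?) can be scripted. The following is a minimal sketch, assuming the usual key/value layout of "qconf -se global" output with backslash line continuations. The sample text and the helper name report_variables are illustrative and do not come from the thread.

```python
def report_variables(qconf_output):
    """Return the report_variables entries from `qconf -se global` style text."""
    # Join backslash-continued lines (assumed layout of the qconf output).
    joined = qconf_output.replace("\\\n", " ")
    for line in joined.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "report_variables":
            value = parts[1].strip()
            if value.upper() == "NONE":
                return []
            return [v.strip() for v in value.split(",") if v.strip()]
    return []


# Illustrative sample only, not real output from the thread.
SAMPLE = """hostname              global
load_scaling          NONE
complex_values        NONE
report_variables      cpu,np_load_avg, \\
                      mem_free,virtual_free
"""

found = report_variables(SAMPLE)
print("mem_free" in found)
```

On a live cluster the text would presumably come from running qconf -se global (for example via subprocess.run) instead of the hard-coded sample.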

Re: [gridengine users] Linux OOM killer oom_adj

2012-08-30 Thread Brian Smith
ory use in order to make the job run. Reuti, have you dealt with this problem? Brian, could you share the memkiller script you use? Thanks, Peter On 08/29/2012 06:09 PM, Brian Smith wrote: We found h_vmem to be highly unpredictable, especially with java-based applications. Stack settings were

Re: [gridengine users] Linux OOM killer oom_adj

2012-08-30 Thread Brian Smith
t -d'=' -f2) ((mem-=extraSpace)) java -mx${mem} myClass ... This Stack Overflow piece seems to cover it nicely: http://stackoverflow.com/questions/7412619/what-are-the-advantages-of-specifiying-memory-limit-of-java-virtual-machine -Brian
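The shell fragment in this snippet appears to take the value after the "=" in a memory request, subtract some headroom, and pass the remainder to the JVM as its maximum heap. Below is a sketch of that arithmetic, assuming the request is a plain number of megabytes. The request string, the 512 MB headroom, and the function name are hypothetical; -Xmx is used here as the standard spelling of the heap limit rather than the -mx form in the snippet.

```python
def java_heap_flag(request, extra_space_mb):
    """Build a JVM max-heap flag from a 'name=value' memory request in MB."""
    # Like cut -d'=' -f2 for a request containing a single '='.
    mem_mb = int(request.split("=", 1)[1])
    # Leave headroom for JVM overhead outside the heap (stack, native memory).
    mem_mb -= extra_space_mb
    if mem_mb <= 0:
        raise ValueError("headroom exceeds the requested memory")
    return "-Xmx%dm" % mem_mb


# Hypothetical request of 4096 MB with 512 MB of headroom.
print(java_heap_flag("mem_free=4096", 512))
```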

Re: [gridengine users] Verifying behavior of max_reservations

2012-08-30 Thread Brian Smith
this particular issue. Is that even possible? -Brian On Thu, Aug 30, 2012 at 5:31 PM, Dave Love wrote: > Brian Smith writes: > > > I have a mix of high-throughput and long wait jobs. We classify and > > prioritize jobs based on runtime. We use a jsv to set > > > >

Re: [gridengine users] Verifying behavior of max_reservations

2012-08-30 Thread Brian Smith
Actually, I'm wrong. That isn't the behavior I'm seeing (blocking due to reserving the complex devel-short-medium-long-xlong). Scratch that. On Thu, Aug 30, 2012 at 8:20 PM, Brian Smith wrote: > I think that the devel-...-xlong parameters being consumable is causing > unf

[gridengine users] Spooling to a database (besides bdb)

2012-08-31 Thread Brian Smith
s has been asked/covered before. -Brian

Re: [gridengine users] Son of Grid Engine 8.1.2 available

2012-09-13 Thread Brian Smith
he mis-reporting of memory in 6.2u5)? > > Correct. - Reuti > > -- > > Community Grid Engine: http://arc.liv.ac.uk/SGE/

Re: [gridengine users] failing mem_free request

2012-09-17 Thread Brian Smith
o the complex_values list for my exec hosts, we successfully worked around the issue. I'm working on getting a virtualized development environment online to test against. Once I do, I'll provide more output on the problem. -Brian Brian Smith Assistant Director, Research Computing Informa

Re: [gridengine users] new users getting little action on the queue

2012-09-18 Thread Brian Smith
ny ideas? Lars -- Brian Smith Assistant Director, Research Computing Information Technology, University of South Florida 4202 E. Fowler Ave. SVC4010 Office Phone: +1 813 974-1467 Organiz

Re: [gridengine users] User access control via LDAP

2013-04-14 Thread Brian Smith
I use this to replicate my LDAP groups from FreeIPA into SGE ACLs. It seems to work pretty well so far and runs as a cron job every minute or so. Depending on your directory server, you may have to change the members line to reflect the way your directory server handles groups (memberlist, uniquemember, etc.). It doe
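The script from this message is not included in the snippet. As a rough illustration of the idea only, the sketch below compares the members of one LDAP group with the current members of an SGE access list and builds the qconf calls needed to reconcile them. It assumes the member lists have already been fetched; the function name, the ACL name, and the user names are hypothetical, and the qconf -au / -du forms (comma-separated users followed by the list name) should be checked against your Grid Engine version.

```python
def acl_sync_commands(acl_name, ldap_members, acl_members):
    """Return qconf argument lists that make the ACL match the LDAP group."""
    to_add = sorted(set(ldap_members) - set(acl_members))
    to_del = sorted(set(acl_members) - set(ldap_members))
    commands = []
    if to_add:
        # qconf -au user[,user...] listname adds users to an access list.
        commands.append(["qconf", "-au", ",".join(to_add), acl_name])
    if to_del:
        # qconf -du user[,user...] listname removes users from it.
        commands.append(["qconf", "-du", ",".join(to_del), acl_name])
    return commands


# Hypothetical data: alice is new in LDAP, carol has left the group.
for cmd in acl_sync_commands("hpcusers", ["alice", "bob"], ["bob", "carol"]):
    print(" ".join(cmd))
```

Returning argument lists rather than running them keeps the sketch side-effect free; a cron job would presumably pass each list to subprocess.run.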

Re: [gridengine users] recommended configuration for hybrid mpi+openmp jobs?

2013-07-08 Thread Brian Smith
-- Brian Smith Assistant Director Research Computing, University of South Florida 4202 E. Fowl