already had some success with beta testers,
greatly simplifying their submit scripts (for more complex cases) and
shortening our documentation :)
The project page is at: https://github.com/brichsmith/gepetools
Suggestions for added features would be appreciated!
Brian Smith
Senior Systems Admin
On 02/02/2012 11:52 AM, Mark Dixon wrote:
On Wed, 1 Feb 2012, Brian Smith wrote:
I've started a github page for some tools I've put together from various
bits of code, how-tos, etc. to simplify the setup of parallel
environments so that they work universally for all MPI implementations.
Would it work?
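For context, a tight-integration parallel environment of the kind such tools set up might look like the following (illustrative values, not actual gepetools output):

```
# qconf -sp mpi
pe_name            mpi
slots              9999
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
```

With control_slaves TRUE and job_is_first_task FALSE, qrsh-spawned slave tasks stay under execd control, which is what makes a PE behave the same across MPI implementations.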
-Brian
Brian Smith
Sr. System Administrator
Research Computing, University of South Florida
4202 E. Fowler Ave. SVC4010
Office Phone: +1 813 974-1467
Organization URL: http://rc.usf.edu
___
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
est file to pull off the desired behavior, and keep only
a small set of PEs available.
-Brian
On 07/11/2012 04:28 PM
this file represents the decision tree that is generated every
schedule interval, so I assume every job should be mentioned. Is this
not the case?
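If the file in question is the scheduler's run log (an assumption on my part), it can also be regenerated on demand rather than waiting for a scheduling interval:

```
# Trigger one scheduler monitoring pass; the decision output lands in
# $SGE_ROOT/$SGE_CELL/common/schedd_runlog
qconf -tsm
```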
Any insight is appreciated.
-Brian
What say other GridEngine gurus about this approach? I believe this
will help with my resource reservation woes and at the very least,
should make my scheduler iterations much shorter. Is there a better
way? Are there any potential pitfalls I may have missed?
Any
Has anyone (outside of Sun/Oracle/Univa-internal) spent
time optimizing the queue configuration with some emphasis on schedule
iteration performance? Seems like a neat area.
Thanks,
-Brian
performance improvement.
Thanks to everyone who provided input on this. I'll post the results of my
resource reservation tests also.
-Brian
On Fri, Aug 17, 2012 at 3:33 PM, Stuart Barkley wrote:
> On Thu, 16 Aug 2012 at 12:07, Brian Smith wrote:
>
> > {
> >name
>>>
>>> However, the job arrays under job #2427 keep PE job #2427 from running.
>>>
>>> What other settings do I need to check for to fix this?
>>>
>>> I am using binary GE2011.11
>>>
>>> Thanks,
>>> Joseph
status tag. I'm going to have one of my guys
implement that here. That should help eke out the extra stability we're
looking for.
Thanks,
-Brian
On Fri, Aug 17, 2012 at 4:28 AM, William Hay wrote:
> On 16 August 2012 17:07, Brian Smith wrote:
>
>
> > I want to ditch t
):
3% > 128 cores
7% 64-128 cores
15% 32-64 cores
20% 16-32 cores
25% 8-16 cores
18% 2-8 cores
12% 1 core
I can crank the value up to 512 and my scheduler intervals are still
small. I have ~450 nodes and ~4500 cpus.
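Assuming the value being cranked up here is max_reservation in the scheduler configuration (the thread context suggests it, but the name is cut off), the relevant excerpt of qconf -msconf would be:

```
# sched_conf excerpt (qconf -msconf); illustrative values
max_reservation                   512
default_duration                  8760:00:00
```

A finite default_duration matters alongside max_reservation, since jobs without a runtime limit otherwise make reservations hard to place.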
-Brian
percentage points of
utilization :)
Thanks for the link to the design documents. They'll be very helpful.
-Brian
On
you should
be looking to schedule your memory usage accordingly. The OOM killer
shouldn't be a factor if memory is handled as a scheduler consideration.
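A minimal sketch of making memory a scheduler consideration, assuming h_vmem as the consumable (values illustrative):

```
# Mark h_vmem consumable in the complex (qconf -mc); the relevant row:
#   name    shortcut  type    relop  requestable  consumable  default  urgency
    h_vmem  h_vmem    MEMORY  <=     YES          YES         0        0

# Then cap each exec host (qconf -me node001):
complex_values        h_vmem=64G
```

Jobs then request memory with -l h_vmem=..., and the scheduler stops placing work on a host once its complex_values allotment is consumed.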
-Brian
tasks (qrsh), it generally eats up more VMEM than the slave tasks
do. It was just hard to get right.
-Brian
On 08/29
I'm using mem_free without issue w/ 8.1.1. What is the output of
qconf -se global
Is mem_free in the report_variables parameter?
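For comparison, the global host configuration only has to carry a line like this for mem_free reporting to work (edit with qconf -me global):

```
report_variables      mem_free
```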
-Brian
On 08/28/2012 07:37 PM, Joseph Farran wrote:
Hi Reuti.
Here it is with the additional info:
$ qrsh -w v -q bio -l mem_free=190G
Job 1637 (-l h_rt=604800,m
memory use in order to make
the job run.
Reuti, have you dealt with this problem? Brian, could you share the
memkiller script you use?
Thanks,
Peter
On 08/29/2012 06:09 PM, Brian Smith wrote:
We found h_vmem to be highly unpredictable, especially with Java-based
applications. Stack settings were
t -d'=' -f2)
((mem-=extraSpace))
java -mx${mem} myClass
...
This stackoverflow piece seems to cover it nicely:
http://stackoverflow.com/questions/7412619/what-are-the-advantages-of-specifiying-memory-limit-of-java-virtual-machine
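To make the quoted fragment concrete, here is a self-contained sketch of the same idea; the names H_VMEM_MB and EXTRA_MB are my own placeholders, not from the original script:

```shell
# Derive a JVM heap size from an SGE h_vmem request, leaving headroom
# for non-heap memory (thread stacks, code cache, metaspace).
# H_VMEM_MB would normally be parsed from the job's h_vmem request;
# it is hard-coded here for illustration.
H_VMEM_MB=4096
EXTRA_MB=512
HEAP_MB=$((H_VMEM_MB - EXTRA_MB))
echo "java -Xmx${HEAP_MB}m MyClass"
```

Requesting -l h_vmem somewhat above the -Xmx value is what keeps the JVM from being killed for its non-heap overhead.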
-Brian
this particular issue. Is that even possible?
-Brian
On Thu, Aug 30, 2012 at 5:31 PM, Dave Love wrote:
> Brian Smith writes:
>
> > I have a mix of high-throughput and long wait jobs. We classify and
> > prioritize jobs based on runtime. We use a jsv to set
> >
> >
Actually, I'm wrong. That isn't the behavior I'm seeing (blocking due to
reserving the complex devel-short-medium-long-xlong). Scratch that.
On Thu, Aug 30, 2012 at 8:20 PM, Brian Smith wrote:
> I think that the devel-...-xlong parameters being consumable is causing
> unf
s has been asked/covered before.
-Brian
the mis-reporting of memory in 6.2u5)?
>
> Correct. - Reuti
>
> >
> > --
> > Community Grid Engine: http://arc.liv.ac.uk/SGE/
to the
complex_values list for my exec hosts, we successfully worked around the
issue. I'm working on getting a virtualized development environment
online to test against. Once I do, I'll provide more output on the problem.
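For reference, the workaround described would look roughly like this on each exec host (attribute name and value are illustrative, since the exact complex involved is cut off above):

```
# qconf -me node001
complex_values        mem_free=48G
```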
-Brian
Brian Smith
Assistant Director, Research Computing
Information Technology, University of South Florida
ny ideas?
Lars
I use this to replicate my LDAP groups from FreeIPA into SGE ACLs.
Seems to work pretty well so far. Runs as a cronjob every minute or so.
Depending on your directory server, you may have to change the members
line to reflect the way your ds handles groups (memberlist,
uniquemember, etc.). It doe
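Since the script itself is cut off here, a rough sketch of the idea (my own reconstruction; the group name, LDAP layout, and sed pattern are assumptions):

```shell
# Turn an LDAP group's member list into "qconf -au user acl" calls.
# In production the heredoc below would be live output from, e.g.:
#   ldapsearch -x -LLL "(cn=$GROUP)" member
GROUP=hpcusers
ldif_output() {
cat <<'EOF'
dn: cn=hpcusers,cn=groups,dc=example,dc=com
member: uid=alice,cn=users,dc=example,dc=com
member: uid=bob,cn=users,dc=example,dc=com
EOF
}
# Pull the uid out of each member line and emit one qconf call per user.
CMDS=$(ldif_output | sed -n 's/^member: uid=\([^,]*\),.*/qconf -au \1 '"$GROUP"'/p')
echo "$CMDS"
```

Running the emitted qconf -au lines (and qconf -du for departed members) from cron keeps the SGE access lists in step with the directory.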