Re: [gridengine users] starting a new gridengine accounting file

2019-01-29 Thread John Young
On 1/29/19 11:15 AM, Skylar Thompson wrote:
> Hi John, Have you looked at using the ${SGE_ROOT}/util/logchecker.sh script? There's documentation on setting it up in doc/logfile-trimming.asc.
Hi Skylar, I hadn't really looked at that file because the accounting file is not a "log" file in the

[gridengine users] starting a new gridengine accounting file

2019-01-29 Thread John Young
The gridengine accounting file on our cluster has gotten rather large. I have looked around in the Gridengine docs for information on how to close it and start another file but if it is there, I missed it. Does anyone know how to do this? -- JY --

[gridengine users] limit slots to core count no longer works

2015-04-14 Thread John Young
Hello, We (fairly) recently upgraded our cluster to Rocks 6.1.1 and we now seem to be having problems with RQS. On our old cluster, we had an RQS quota set as follows:
{
   name         host-slots
   description  restrict slots to core count
   enabled      TRUE
   limit        hosts {*} to s

Re: [gridengine users] queues behaving differently

2012-07-11 Thread John Young
On 07/10/2012 04:30 PM, Rayson Ho wrote: On Tue, Jul 10, 2012 at 4:23 PM, John Young wrote: With this in place, it seems odd that from one of my queues I get a default setting for the number of descriptors of 1024. So I have two questions really: 1. Why am I getting different behavior from

Re: [gridengine users] queues behaving differently

2012-07-10 Thread John Young
On 07/10/2012 04:14 PM, Rayson Ho wrote: On Tue, Jul 10, 2012 at 4:02 PM, John Young wrote: If you really have a real use-case for setting the # of descriptors in the queue config, then let us know and we can implement that in OGS/GE (... when time permits). Well... I have an engineer here

Re: [gridengine users] queues behaving differently

2012-07-10 Thread John Young
On 07/10/2012 03:47 PM, Rayson Ho wrote: The number of file descriptors is not part of the queue limit, see the message I sent to the list 2 months ago: http://gridengine.org/pipermail/users/2012-May/003705.html If you really have a real use-case for setting the # of descriptors in the queue co

[gridengine users] queues behaving differently

2012-07-10 Thread John Young
I have a short test job that I can submit to different queues on my cluster that appear to be configured the same, but I get different results. Here is the job:
---
#!/bin/tcsh
#
#$ -N show-limits
#$ -S /bin/tcsh
#$ -o show-limits.out
#$ -e show-limits.err

[gridengine users] Can I force a parallel job to be run on a single client?

2012-05-25 Thread John Young
If I wanted to be sure that a particular parallel job ran only on a single client node (as opposed to being spread across multiple nodes in our cluster), is there a good way to do that under gridengine? To be clear, I don't care which client is used, but I want the parallel job to run on a single

[gridengine users] gridengine overriding shell environment?

2011-06-03 Thread John Young
One of the engineers here is having problems with any job that tries to use more than 1024 cores. His csh script is getting a 'Too many open files' error, so I tried raising the descriptors limit in the shell from 1024 to 65535. That seems to have worked for interactive logins, but not for grideng

[gridengine users] controlling where an MPICH2 job is started from?

2011-06-01 Thread John Young
We have a heterogeneous grid with two types of execution hosts -- some newer nodes with 32 cores and 2 GB of memory per core and some older nodes with two cores and 2 to 4 GB of memory per core. We have an engineer who submits parallel jobs using MPICH2 to the grid. While we have not yet figured o

Re: [gridengine users] sgeexecd not starting after reinstall on Rocks clients

2011-05-25 Thread John Young
OK. I don't know if this is the *best* solution, but it does seem to work. I noticed (by putting commands in Rocks' client customization file, extend-compute.xml) that the spool directory was there but (at least during this part of the install) was owned by root: /bin/ls -ld /opt/gridengine/defau

Re: [gridengine users] sgeexecd not starting after reinstall on Rocks clients

2011-05-25 Thread John Young
On 05/25/2011 02:10 PM, Jonathan Pierce wrote:
> Assuming you're installing SGE at the same time as the OS, you should be
> able to rename it during post-config, which happens before the node first
> boots (see:
> http://www.rocksclusters.org/roll-documentation/base/5.4/customization-postconfig

Re: [gridengine users] sgeexecd not starting after reinstall on Rocks clients

2011-05-25 Thread John Young
On 05/25/2011 11:03 AM, Steffen Neumann wrote:
> On Wed, 2011-05-25 at 09:30 -0400, John Young wrote:
> ...
>> I can manually start sgeexecd and it comes up
> ...
>> I have looked around for some log that might give me a clue what is
>> happening, but so far I have n

[gridengine users] sgeexecd not starting after reinstall on Rocks clients

2011-05-25 Thread John Young
I'm not really sure if this is a rocks problem or a gridengine problem. After my clients do a reinstall, sgeexecd is not starting. At first, it wasn't even appearing in the chkconfig list, so I tried putting a line in extend-compute.xml to force it "on". That seems to have worked, as now if I lo

[gridengine users] assign attributes to nodes?

2011-05-18 Thread John Young
Under Torque, one could assign arbitrary attributes to nodes and then create queues that required those attributes. Then a user job that was submitted to that queue was guaranteed to run on one of the nodes that had that attribute. I am trying to do roughly the same thing with gridengine. I want

Re: [gridengine users] tight integration of MPICH2 paper?

2011-04-27 Thread John Young
On 04/27/2011 01:26 PM, Rayson Ho wrote:
> This one?
>
> http://gridscheduler.sourceforge.net/howto/mpich2-integration/mpich2-integration.html
>
> Rayson
That looks like it -- thanks! JY

[gridengine users] tight integration of MPICH2 paper?

2011-04-27 Thread John Young
I have seen references on the net to a paper by Reuti on the tight integration of MPICH2 with gridengine, but all of the sources that I have found so far seem to just land me on Oracle's home page. :-/ Can anyone point to a copy of this paper on a machine not owned by Oracle? JY