On 1/29/19 11:15 AM, Skylar Thompson wrote:
Hi John,
Have you looked at using the ${SGE_ROOT}/util/logchecker.sh script? There's
documentation on setting it up in doc/logfile-trimming.asc.
Hi Skylar,
I hadn't really looked at that file because the accounting
file is not a "log" file in the usual sense.
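That said, a hedged sketch of driving it from cron (the schedule is
only an illustration; the script's own variables are set as described
in doc/logfile-trimming.asc):

    # weekly trim; assumes SGE_ROOT is set in cron's environment
    0 3 * * 0  ${SGE_ROOT}/util/logchecker.sh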
The gridengine accounting file on our cluster has gotten
rather large. I have looked around in the Gridengine docs
for information on how to close it and start another file,
but if it is there, I missed it.
Does anyone know how to do this?
--
JY
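For reference, a hedged sketch of rotating it by hand (cell name
"default" assumed; sge_qmaster opens the file in append mode, so
truncating it in place is safe):

    cd ${SGE_ROOT}/default/common
    cp accounting accounting.$(date +%Y%m%d)   # archive the old records
    : > accounting                             # truncate; qmaster keeps appending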
Hello,
We (fairly) recently upgraded our cluster to Rocks 6.1.1
and we now seem to be having problems with RQS. On our old
cluster, we had an RQS quota set as follows:
{
   name         host-slots
   description  restrict slots to core count
   enabled      TRUE
   limit        hosts {*} to slots=$num_proc
}
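A hedged way to check what survived the upgrade (rule name from above):

    qconf -srqs host-slots    # print the resource quota set
    qquota                    # show quotas currently applied to you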
On 07/10/2012 04:30 PM, Rayson Ho wrote:
On Tue, Jul 10, 2012 at 4:23 PM, John Young wrote:
With this in place, it seems odd that from one of my queues I
get the default descriptor limit of 1024.
So I have two questions, really:
1. Why am I getting different behavior from
On 07/10/2012 04:14 PM, Rayson Ho wrote:
On Tue, Jul 10, 2012 at 4:02 PM, John Young wrote:
If you really have a real use-case for setting the # of descriptors in
the queue config, then let us know and we can implement that in OGS/GE
(... when time permits).
Well... I have an engineer here
On 07/10/2012 03:47 PM, Rayson Ho wrote:
The number of file descriptors is not part of the queue limit; see the
message I sent to the list 2 months ago:
http://gridengine.org/pipermail/users/2012-May/003705.html
If you really have a real use-case for setting the # of descriptors in
the queue config, then let us know and we can implement that in OGS/GE
(... when time permits).
I have a short test job that I can submit to different
queues on my cluster that appear to be configured the
same, but I get different results. Here is the job:
---
#!/bin/tcsh
#
#$ -N show-limits
#$ -S /bin/tcsh
#$ -o show-limits.out
#$ -e show-limits.err
#
# print this shell's resource limits (including 'descriptors')
limit
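Submitting the same job to two queues makes the difference visible
(queue names and the script filename are placeholders; SGE appends to
an existing -o file, so both runs land in show-limits.out):

    qsub -q all.q    show-limits.csh
    qsub -q other.q  show-limits.csh
    # then compare the 'descriptors' lines in show-limits.out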
If I wanted to be sure that a particular parallel job ran only
on a single client node (as opposed to being spread across
multiple nodes in our cluster), is there a good way to do that
under gridengine? To be clear, I don't care which client is
used, but I want the parallel job to run on a single node.
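One standard way (a sketch; the PE name "smp" is hypothetical): a
parallel environment whose allocation_rule is $pe_slots packs all
slots of a job onto one host.

    # qconf -ap smp, setting at least:
    #   pe_name          smp
    #   slots            999
    #   allocation_rule  $pe_slots
    # then add smp to the queue's pe_list and submit:
    qsub -pe smp 8 job.sh    # all 8 slots on a single node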
One of the engineers here is having problems with any job
that tries to use more than 1024 cores. His csh script is
getting a 'Too many open files' error, so I tried raising
the descriptors limit in the shell from 1024 to 65535.
That seems to have worked for interactive logins, but not
for gridengine jobs.
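A hedged sketch of the usual fix (init-script name and path vary by
install): batch jobs inherit limits from sge_execd rather than from a
login shell, so the limit has to be raised where the daemon starts.

    # in the execd startup script (e.g. /etc/init.d/sgeexecd),
    # before sge_execd is launched:
    ulimit -n 65535
    # then restart execd on the node so new jobs inherit the limit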
We have a heterogeneous grid with two types of execution hosts --
some newer nodes with 32 cores and 2 GB of memory per core and
some older nodes with two cores and 2 to 4 GB of memory per core.
We have an engineer who submits parallel jobs using MPICH2 to the
grid. While we have not yet figured out
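On a mixed grid like this, memory requests are a common way to keep
MPI tasks off hosts that are too small (a sketch; the PE name and
values are placeholders):

    # only hosts reporting at least 2G free are considered
    qsub -pe mpich2 64 -l mem_free=2G mpi_job.sh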
OK. I don't know if this is the *best* solution, but it does
seem to work.
I noticed (by putting commands in Rocks' client customization
file, extend-compute.xml) that the spool directory was there
but (at least during this part of the install) was owned by
root:
/bin/ls -ld /opt/gridengine/defau
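A sketch of the likely repair (owner and path are assumptions; match
your install), run from extend-compute.xml once SGE is laid down:

    # give the spool tree back to the SGE admin account
    chown -R sgeadmin:sgeadmin /opt/gridengine/default/spool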
On 05/25/2011 02:10 PM, Jonathan Pierce wrote:
> Assuming you're installing SGE at the same time as the OS, you should be
> able to rename it during post-config, which happens before the node first
> boots (see:
> http://www.rocksclusters.org/roll-documentation/base/5.4/customization-postconfig
On 05/25/2011 11:03 AM, Steffen Neumann wrote:
> On Wed, 2011-05-25 at 09:30 -0400, John Young wrote:
> ...
>> I can manually start sgeexecd and it comes up
> ...
>> I have looked around for some log that might give me a clue what is
>> happening, but so far I have not found one.
I'm not really sure if this is a rocks problem or a gridengine
problem.
After my clients do a reinstall, sgeexecd is not starting. At first,
it wasn't even appearing in the chkconfig list, so I tried putting a
line in extend-compute.xml to force it "on". That seems to have worked,
as now if I lo
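The forcing line itself is simple (a sketch; goes in the <post>
section of extend-compute.xml, service name as shipped by the Rocks
SGE roll):

    chkconfig --add sgeexecd
    chkconfig sgeexecd on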
Under Torque, one could assign arbitrary attributes to nodes
and then create queues that required those attributes. Then
a user job that was submitted to that queue was guaranteed to
run on one of the nodes that had that attribute.
I am trying to do roughly the same thing with gridengine. I
want
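The rough Grid Engine equivalent is a requestable complex attached to
hosts (a sketch; the attribute name "fastio" is invented):

    # qconf -mc: add a line such as
    #   fastio  fio  BOOL  ==  YES  NO  FALSE  0
    qconf -me node01             # set: complex_values fastio=TRUE
    qsub -l fastio=true job.sh   # runs only on hosts with the attribute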
On 04/27/2011 01:26 PM, Rayson Ho wrote:
> This one?
>
> http://gridscheduler.sourceforge.net/howto/mpich2-integration/mpich2-integration.html
>
> Rayson
That looks like it -- thanks!
JY
I have seen references on the net to a paper by Reuti on the
tight integration of MPICH2 with gridengine, but all of the
sources that I have found so far seem to just land me on Oracle's
home page. :-/ Can anyone point to a copy of this paper
on a machine not owned by Oracle?
JY