Ole Holm Nielsen
> wrote:
>
> Hi Hoot,
>
> I'm glad that you have figured out that GrpTRESMins is working as documented
> and kills running jobs when the limit is exceeded. This would only occur if
> you lower the GrpTRESMins limit after a job has started.
>
> /Ole
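For anyone retracing this thread, the moving pieces can be sketched roughly as follows; the account name `proj` and the limit value are placeholders, and enforcement assumes accounting via slurmdbd is already working:

```shell
# slurm.conf must enable limit enforcement for GrpTRESMins to act at all:
#   AccountingStorageEnforce=limits,safe

# Set a group limit of 10000 CPU-minutes on an account:
sacctmgr modify account proj set GrpTRESMins=cpu=10000

# Inspect the limit and the usage accumulated against it:
sacctmgr show assoc account=proj format=Account,User,GrpTRESMins
sshare -l -A proj
```

As Ole notes above, a job that is already running is only killed if the limit is lowered below the accumulated usage after the job starts.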
canceled.
Thanks for your help.
> On Apr 24, 2023, at 1:55 PM, Ole Holm Nielsen
> wrote:
>
> On 24-04-2023 18:33, Hoot Thompson wrote:
>> In my reading of the Slurm documentation, it seems that exceeding the limits
>> set in GrpTRESMins should result in terminating a running job.
Is there a mechanism for terminating an active job based on a predetermined
threshold (CPU usage, budget, etc.)? GrpTRESMins reads like it should work, but
in practice it doesn't seem to.
Thanks in advance!
So Ole, any thoughts on the config info I sent?
I’m still not certain if terminating a running job based on GrpTRESMins is even
possible or supposed to work.
Hoot
> On Apr 24, 2023, at 3:21 PM, Hoot Thompson wrote:
>
See below ...
> On Apr 24, 2023, at 1:55 PM, Ole Holm Nielsen
> wrote:
>
> On 24-04-2023 18:33, Hoot Thompson wrote:
>> In my reading of the Slurm documentation, it seems that exceeding the limits
>> set in GrpTRESMins should result in terminating a running job.
In my reading of the Slurm documentation, it seems that exceeding the limits
set in GrpTRESMins should result in terminating a running job. However, in
testing this, the 'current value' of GrpTRESMins only updates upon job
completion and is not updated as the job progresses. Therefore jobs
nts can be so different, and because Slurm is a fantastic
> software that can be configured for many different scenarios, IMHO a support
> contract with SchedMD is the best way to get consulting services, get general
> help, and report bugs. We have excellent experiences with SchedM
So an update: GrpTRES registers a value while a job is running, but GrpTRESMins
does not. So I still have something wrong. GrpTRESMins reads in the docs like
it is in fact an aggregate number.
> On Apr 20, 2023, at 1:01 PM, Ole Holm Nielsen
> wrote:
>
> On 20-04-2023 18:23, Hoot Thompson wrote:
And it indeed does show current value for a running job!! Do I feel stupid :-)
> On Apr 20, 2023, at 1:01 PM, Ole Holm Nielsen
> wrote:
>
> On 20-04-2023 18:23, Hoot Thompson wrote:
>> Ole,
>> Earlier I found your Slurm_tools posting and found it very useful. This
Ah, I thought that was the aggregate of past and current jobs.
> On Apr 20, 2023, at 1:01 PM, Ole Holm Nielsen
> wrote:
>
> On 20-04-2023 18:23, Hoot Thompson wrote:
>> Ole,
>> Earlier I found your Slurm_tools posting and found it very useful. This
>> rema
ubuntu 1 0
0.00 0.00
Clearly I’m still missing something or I don’t understand how it’s supposed to
work.
Hoot
> On Apr 20, 2023, at 2:10 AM, Ole Holm Nielsen
> wrote:
>
> Hi Hoot,
>
> On 4/20/23 00:15, Hoot Thompson wrote:
Thank you for this. I’ll give it a read but no promises that I won’t be back
with more questions!
Hoot
> On Apr 20, 2023, at 2:10 AM, Ole Holm Nielsen
> wrote:
>
> Hi Hoot,
>
> On 4/20/23 00:15, Hoot Thompson wrote:
>> Is there a ‘how to’ or recipe document for
Is there a ‘how to’ or recipe document for setting up and enforcing resource
limits? I can establish accounts, users, and set limits but 'current value' is
not incrementing after running jobs.
Thanks in advance
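Since the question above is essentially asking for a setup recipe, here is a rough sketch of the usual order of operations; account and user names are placeholders, and the slurm.conf lines assume slurmdbd accounting:

```shell
# slurm.conf (controller):
#   AccountingStorageType=accounting_storage/slurmdbd
#   AccountingStorageEnforce=associations,limits,qos

# Build the association hierarchy and attach a limit:
sacctmgr add account proj Description="project" Organization="org"
sacctmgr add user ubuntu account=proj
sacctmgr modify account proj set GrpTRESMins=cpu=10000

# 'Current value' only moves if jobs are actually being recorded; verify with:
sacct -a --starttime=today
sshare -l
```

If `sacct` shows no completed jobs, usage will never increment, regardless of limit settings.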
I'm running slurm on an AWS cluster and there seems to be a race
condition whereby the PrologSlurmctld script runs on occasion when
compute nodes try to transition to CG from P but fall back to P when
nodes are not available. Have you seen this behavior and is there a way
to prevent it? I'm
Can the prolog script be configured to only run on a cluster head node
as opposed to compute nodes?
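On the head-node question specifically: Slurm has two separate prolog hooks, and one of them runs only on the controller. A minimal slurm.conf fragment (script paths are placeholders):

```shell
# slurm.conf
# Runs on the slurmctld (head) node at job allocation time:
#   PrologSlurmctld=/etc/slurm/prolog_ctld.sh
# Runs on each allocated compute node before the job's first task:
#   Prolog=/etc/slurm/prolog.sh
```

Using only PrologSlurmctld keeps the script off the compute nodes entirely.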
Thank you for the support. I will be back with any additional questions.
BTW, if it changes or adds to your thoughts, I'm working in AWS on a
parallelcluster.
Hoot
On 1/21/22 4:12 AM, Ole Holm Nielsen wrote:
On 1/21/22 10:05, Diego Zuccato wrote:
On 21/01/2022 07:51, Ole Holm Nielsen wrote:
How do you change the default memory per node from the current 1MB to
something much higher?
Thanks in advance.
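The 1MB comes from RealMemory defaulting to 1 when a node definition does not declare its memory. A sketch of the relevant slurm.conf knobs; all values and node names are placeholders, and on ParallelCluster the node definitions are generated, so the equivalent setting belongs in the cluster config rather than hand-edited slurm.conf:

```shell
# slurm.conf
# Schedulable memory is declared per node (in MB); it defaults to 1:
#   NodeName=hpc-demand-dy-c5n18x-[1-4] CPUs=36 RealMemory=190000
# Defaults applied when a job requests no memory:
#   DefMemPerNode=180000                           # cluster-wide, in MB
#   PartitionName=q Nodes=ALL DefMemPerCPU=4000    # or per partition
# Push the change out:
scontrol reconfigure
```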
ubuntu@node:/shared$ sinfo -o "%20N%10c%10m%25f%10G "
NODELIST              CPUS      MEMORY    AVAIL_FEATURES            GRES
hpc-demand-dy-c5n18x  36        1         dynamic,c5n.18xlarge,c5n1 (null)
Ok, a fresh start after installing the two recommended packages and things
appear to be working. Thanks for the help!
On 9/23/21, 3:04 PM, "slurm-users on behalf of Hoot Thompson"
wrote:
Do I need to specify the json path in the configure process?
On 9/23/21, 2:45 PM, "slurm-users on behalf of Hoot Thompson"
wrote:
If this useful, note that there's no attempt to build anything in the
serializer/json directory.
Making all in serializer
make[4]
make[5]: Leaving directory '/home/ubuntu/slurm-21.08.1/src/plugins/serializer'
make[4]: Leaving directory '/home/ubuntu/slurm-21.08.1/src/plugins/serializer'
Making all in site_factor
make[4]: Entering directory '/home/ubuntu/slurm-21.08.1/src/plugins/site_factor'
On 9/23/21, 2:24 PM, "slu
What's getting built is
serializer_url_encoded.a
serializer_url_encoded.la
serializer_url_encoded.so
if this helps.
On 9/23/21, 2:10 PM, "slurm-users on behalf of Hoot Thompson"
wrote:
On Ubuntu 20.04 I installed ...
libjson-c-dev
libhttp-parser-dev
Shouldn't that work? No joy so far.
On 9/23/21, 1:30 PM, "slurm-users on behalf of Ole Holm Nielsen"
wrote:
On 23-09-2021 16:01, Hoot Thompson wrote:
> In upgrading to 21.08.1, slurmctld status reports:
>
In upgrading to 21.08.1, slurmctld status reports:
Sep 23 13:49:52 ip-10-10-7-17 systemd[1]: Started Slurm controller daemon.
Sep 23 13:49:52 ip-10-10-7-17 slurmctld[1323]: fatal: Unable to find plugin:
serializer/json
Sep 23 13:49:52 ip-10-10-7-17 systemd[1]: slurmctld.service: Main
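The missing plugin is only built when configure finds the JSON headers, which matches the packages named earlier in the thread. A sketch of the rebuild (Ubuntu package names; the prefix is a placeholder):

```shell
# Headers the serializer/json (and REST) plugins need at configure time:
sudo apt-get install -y libjson-c-dev libhttp-parser-dev

# Rebuild so src/plugins/serializer/json is actually generated:
cd slurm-21.08.1
./configure --prefix=/usr/local/slurm
grep -i 'json' config.log     # confirm json-c was detected
make -j4 && sudo make install
```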
I have the REST API basically working but I am having a problem with job
submission syntax. The error I receive is ‘Unable to parse query”. I have
followed the guides found on-line to no avail. Is there somewhere to look for
what the issue may be?
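"Unable to parse query" from slurmrestd usually means the request body was not valid JSON or the Content-Type header was missing. A minimal sketch, assuming slurmrestd listening on localhost:6820, JWT auth configured, and API version v0.0.37 (the version string depends on your build):

```shell
# Get an auth token (requires JWT auth configured in slurm.conf):
export SLURM_JWT=$(scontrol token | cut -d= -f2)

# Submit a job as a JSON document; malformed JSON or a missing
# Content-Type header is a common cause of "Unable to parse query":
curl -s -X POST 'http://localhost:6820/slurm/v0.0.37/job/submit' \
  -H "X-SLURM-USER-NAME: $USER" \
  -H "X-SLURM-USER-TOKEN: $SLURM_JWT" \
  -H 'Content-Type: application/json' \
  -d '{
        "job": {
          "name": "rest-test",
          "partition": "debug",
          "current_working_directory": "/tmp",
          "environment": {"PATH": "/bin:/usr/bin"}
        },
        "script": "#!/bin/bash\nhostname"
      }'
```

Piping the `-d` payload through `python -m json.tool` first is a quick way to rule out a JSON syntax error.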
erver/administration/performance-tuning/role/remote-desktop/session-hosts#remote-desktop-session-host-tuning-parameters).
Set the "Performance Options" in Control Panel > System to "Adjust for best
performance."
On Thu, Jan 30, 2020 at 5:14 PM Hoot Thompson
mailt
We’ve inherited a somewhat old VCL cluster where the common complaint is poor
graphics over RDP for such things as Matlab. What are the VCL best practices
for obtaining good graphics performance?
Thanks in advance!
Working on a stock CentOS 7 system and running the red5.sh script, the
sendmail portion seems to be in a loop
WARN 08-15 14:42:44.467 ConfigurationDao.java 181704 118
org.apache.openmeetings.db.dao.basic.ConfigurationDao
[org.springframework.scheduling.quartz.SchedulerFactoryBean#0_Worker-7]
I'm working on a group of Windows Server 2012 R2 systems. Prior to
joining the systems to our domain controller, I install the openssh
components after which I can ssh into the system using the Administrator
account and password. However, as soon as I join a server to our domain
controller, ssh no
When trying to launch a Centos6.5 VM (pulled from
ovirt-image-repository) onto a Westmere (Dell 6100)
host through oVirt, the following error occurs from
libvirt:
3312: warning : x86Decode:1517 : Preferred CPU model Westmere not
allowed by hypervisor; closest supported model will be used
3312:
Is there a basic howto in terms of constructing a DebianGis install?
Thanks in advance!
--
To UNSUBSCRIBE, email to debian-gis-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: https://lists.debian.org/544a5f8f.2050...@ptpnow.com
Archive: https://lists.debian.org/544a5b69.7050...@nasa.gov
What's the easiest way to get details as to file sizes, blocking, etc.
of data stored in a hdfs file system. I'm really interested in how files
are broken up as they are put into hdfs.
Thanks.
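The filesystem checker exposes exactly this; a sketch, with placeholder paths and the hadoop-1.x era command names used elsewhere in this thread:

```shell
# Per-file size, block count, block IDs, and which datanodes hold each block:
hadoop fsck /user/hoot/data -files -blocks -locations

# The split size that governs how files are broken up is dfs.block.size
# in hdfs-site.xml (64 MB default in hadoop 1.x). Per-file values:
hadoop fs -stat "%o %r %n" /user/hoot/data/file.seq  # blocksize, replication, name
```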
Re: CFTA-UTF8 warning message
Sure, people on the mailing list may help as well.
regards,
Eric
On Sat, Feb 18, 2012 at 10:51 AM, Hoot Thompson h...@ptpnow.com wrote:
Thank you for the response. I guess I should start at the beginning. I
loaded chukwa as part of trying to get Intel's HiTune
Trying to build version 0.5.0 from source, everything looks good with
the exception of the following error. What does it mean?
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.15 sec
Results :
Failed tests:
I've been using hadoop/mapreduce for awhile and now I want to add HiTune
to analyze my performance. Step one is to get chukwa up and running. I'm
using hadoop 1.0 and chukwa 0.4.0. I can't seem to get a collector to
start. Here's the output.
hadoop@hadoop-nfs:/usr/local/other/chukwa/bin$
I'm trying to prove that my cluster will in fact support multiple reducers;
the wordcount example doesn't seem to spawn more than one (1). Is that
correct? Is there a surefire way to prove my cluster is configured
correctly in terms of launching the maximum (say two per node) number of
mappers
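The examples jar accepts generic options, so the reducer count can be forced explicitly; a sketch using the Hadoop 1.x property name and placeholder paths:

```shell
# Request two reduce tasks for the wordcount example:
hadoop jar hadoop-examples.jar wordcount \
  -D mapred.reduce.tasks=2 \
  /input /output

# One part-* output file is written per reducer, so listing the output
# directory shows how many reducers actually ran:
hadoop fs -ls /output
```

With the default of one reducer, wordcount spawning a single reduce task is expected behavior, not a misconfiguration.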
I can't seem to get past this heapsize error. Relevant settings are as
follows:
# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=1
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1048576m</value>
</property>
11/11/12 17:38:27 INFO hpc.Driver: Jar
Name:
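Two values in the settings quoted above look suspicious rather than insufficient: HADOOP_HEAPSIZE is expressed in MB, so 1 means a 1 MB daemon heap, and -Xmx1048576m asks each child JVM for roughly 1 TiB. A sketch of more plausible values (the sizes are guesses chosen to illustrate the units):

```shell
# hadoop-env.sh -- the value is in MB, so this is ~2 GB, not 2 MB:
export HADOOP_HEAPSIZE=2000

# mapred-site.xml -- per-child JVM heap; 1048576m would be ~1 TiB:
#   <property>
#     <name>mapred.child.java.opts</name>
#     <value>-Xmx2048m</value>
#   </property>
```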
On 15 Nov 2011, at 00:39, Hoot Thompson wrote:
Re: Mapreduce heap size error
Still issues, around 2300 unique files
hadoop
On 15 Nov 2011, at 00:39, Hoot Thompson wrote:
Re
Any suggestions as to how to track down the root cause of these errors?
1178709 [main] INFO org.apache.hadoop.mapred.JobClient - map 6% reduce 0%
1178709 [main] INFO org.apache.hadoop.mapred.JobClient - map 6% reduce 0%
11/11/15 00:45:29 INFO mapred.JobClient: Task Id :
Still issues, around 2300 unique files
hadoop@lobster-nfs:~/querry$ hadoop jar HadoopTest.jar -D
mapred.child.java.opts=-Xmx4096M
hdfs://lobster-nfs:9000/hadoop_fs/dfs/merra/seq_out
/hadoop_fs/dfs/output/test_14_r2.out
11/11/15 01:56:20 INFO hpc.Driver: Jar Name:
Can't seem to get past this heap size error, any ideas where to look? Below
are my heap size settings, at least the ones I attempted to increase.
Thanks in advance for any thoughts.
# The maximum amount of heap to use, in MB. Default is 1000.
export HADOOP_HEAPSIZE=1
I will try NetPIPE or similar
On 8/16/11 9:01 AM, "Jeff Squyres" <jsquy...@cisco.com> wrote:
> Are you able to run other TCP-based applications between the two VM's, such as
> the TCP version of NetPIPE?
>
>
> On Aug 15, 2011, at 10:57 PM, Hoot Thompson wrote:
thanks
On 5/8/11 7:55 AM, Joseph Glanville joseph.glanvi...@orionvm.com.au
wrote:
Hi,
XCP questions are usually answered on the xen-api list.
I am cc'ing the list in on this so that someone might answer it for you.
Joseph.
On 7 May 2011 08:12, Hoot Thompson h...@ptpnow.com wrote
A general question, does gridFTP use direct or buffered IO when accessing
disk?
Hoot
_
From: Raj Kettimuthu [mailto:ketti...@mcs.anl.gov]
Sent: Tuesday, September 14, 2010 3:31 PM
To: Hoot Thompson
Cc: gt-user@lists.globus.org
Subject: Re: Call
On Sep 14, 2010, at 8:10 AM, Hoot
test from GSFC (Greenbelt, MD) to the
SC10 floor in New Orleans which leads to another question, how is
gridFTP impacted by increasing rtts?.
Thanks again for all the help,
Hoot
-Original Message-
From: Raj Kettimuthu ketti...@mcs.anl.gov
To: Hoot Thompson h...@ptpnow.com
Cc: gt-user
on
the attachment are the three gridftp-servers I have running on each
server. Are they correct? Does the command line you gave below still
apply?
Hoot
-Original Message-
From: Raj Kettimuthu ketti...@mcs.anl.gov
To: Hoot Thompson h...@ptpnow.com
Subject: Re: Call
Date: Thu, 9 Sep 2010 17:04
I've been doing that as sudo.
-Original Message-
From: Michael Link [mailto:ml...@mcs.anl.gov]
Sent: Tuesday, August 31, 2010 3:44 PM
To: Hoot Thompson
Cc: Prakash Velayutham; gt-user@lists.globus.org
Subject: Re: [gt-user] Stripe mode over multiple links between two servers
You would
: Hoot Thompson h...@ptpnow.com
Subject: RE: [gt-user] Stripe mode over multiple links between two servers
Date: Tue, 24 Aug 2010 14:58:39 -0400
Ok. Just to repeat in my own words, two servers with two interfaces each
can be striped if GSI is used.
Yes. I'd expect you'd end up running three GridFTP
-
From: Martin Feller fel...@mcs.anl.gov
To: Hoot Thompson h...@ptpnow.com
Cc: gt-user@lists.globus.org
Subject: Re: [gt-user] Stripe mode over multiple links between two
servers
Date: Fri, 27 Aug 2010 07:04:53 -0500
The CA itself should stay on one machine and should not be copied to
multiple nodes
All the pieces seem to be basically working and I'm able to transfer
data between my two test servers using ssh authentication. However, my
servers are multi-homed with both GigE and 10 GigE links. My goal is
get data transferred over the 10 GIgE link but even though the initial
authentication
I have two servers, each with two 10GigE links and I would like to
stripe a file across the two links. I'm currently authenticating using
ssh. Can I do this using the gridftp server stripe mode and if so, how
do I set it up?
Thanks!
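One way striped transfers are typically set up; this is only a sketch, with placeholder hostnames, ports, and 10GigE interface addresses, and it assumes GSI rather than sshftp authentication (as the later replies in this thread note):

```shell
# On each server, run a backend data node bound to the 10GigE interface
# (10.0.0.1 / 10.0.0.2 are placeholder addresses for the 10GigE links):
globus-gridftp-server -dn -p 6002 -data-interface 10.0.0.1

# On one server, run the frontend that coordinates the stripes across
# the two backends:
globus-gridftp-server -S -p 2811 -r 10.0.0.1:6002,10.0.0.2:6002

# From the client, request a striped transfer through the frontend:
globus-url-copy -stripe -tcp-bs 4M \
  gsiftp://frontend:2811/data/bigfile file:///dest/bigfile
```

Binding the data nodes to the 10GigE addresses is what keeps the data path off the GigE links even when the control connection arrives on them.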
From: Hoot Thompson h...@ptpnow.com
Subject: [gt-user] Stripe mode over multiple links between two servers
Date: Tue, 24 Aug 2010 14:03:39 -0400
I have two servers, each with two 10GigE links and I would like to
stripe a file across
Date: Tue, 24 Aug 2010 14:48:48 -0500 (CDT)
From: Hoot Thompson h...@ptpnow.com
Subject: RE: [gt-user] Stripe mode over multiple links between two servers
Date: Tue, 24 Aug 2010 14:58:39 -0400
Ok. Just to repeat in my own words, two servers with two interfaces each
compiling 2.7 on suse 11
Operating system? Version of the compiler? As it stands, your runtime
library basically isn't supplying this function.
Jim
-Original Message-
From: antlr-interest-boun...@antlr.org [mailto:antlr-interest-
boun...@antlr.org] On Behalf Of Hoot Thompson
Sent: Monday
Anybody seen/having this problem...
/usr/bin/make -C lib/cpp/src all
make[3]: Entering directory `/root/antlr-2.7.7/lib/cpp/src'
***
compiling /root/antlr-2.7.7/lib/cpp/src/../../../lib/cpp/src/ANTLRUtil.cpp
***
compiling
Will the srp module be available in the OFED 1.5 release? If so, when?
___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit
Initiator
Thanks for the quick response.
-Original Message-
From: Bart Van Assche [mailto:bart.vanass...@gmail.com]
Sent: Monday, September 28, 2009 7:54 AM
To: Hoot Thompson
Cc: general@lists.openfabrics.org
Subject: Re: [ofa-general] Srp in OFED 1.5
On Mon, Sep 28, 2009 at 1:35
Thanks for the feedback.
Hoot
-Original Message-
From: Bart Van Assche [mailto:bart.vanass...@gmail.com]
Sent: Monday, September 28, 2009 12:39 PM
To: Hoot Thompson
Cc: general@lists.openfabrics.org
Subject: Re: [ofa-general] Srp in OFED 1.5
On Mon, Sep 28, 2009 at 2:06 PM, Hoot
Will the srp module be included in the OFED 1.5 release? If so, when?
I've seen other postings noting a similar error but have not seen a
resolution. When trying to load the ib_ipoib module I get the following
error. ib_ipoib: Unknown symbol icmpv6_send. How do I clear this error?
It's a SuSE 10 SP1 system.
Thanks
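icmpv6_send is exported by the kernel's ipv6 module when IPv6 is built as a module rather than built in, so an unknown-symbol error here usually just means ipv6 isn't loaded yet. A sketch of the check and workaround:

```shell
# Load the module that exports icmpv6_send, then retry ipoib:
modprobe ipv6
modprobe ib_ipoib

# Verify the symbol is now present in the running kernel:
grep icmpv6_send /proc/kallsyms
```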
database using the UNH driver. Later on however, the LUNs are recognized as
generic scsi devices as opposed to scsi disks. It was suggested that
perhaps the problem is in the scsi layer. BTW a Ciprico RAID attached to
the fabric is seen by the Linux machines.
Any suggestions are welcome,
Hoot