[slurm-dev] Re: Prune database before migration to 14.11 ?

2014-10-21 Thread Christopher Samuel

On 16/10/14 16:02, Christopher Samuel wrote:

> No worries, we're going to test out ours in a sandbox as well, so we'll
> be able to compare it to our (pretty beefy) DB servers.

It took around 2 minutes to add all the indexes in our sandbox; that's
with a total of about 6 million jobs across 5 systems (of which 2 are
now decommissioned).

We're about to do it for real.

All the best,
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/  http://twitter.com/vlsci


[slurm-dev] Re: Unable to start slurmdbd after system crash

2014-10-21 Thread Trey Dockendorf
Sorry for the noise. I had never seen this error before, so I figured it
was SLURM specific, but it's not.  It's been a long day and I should have
googled it first :)

mysql mysql -e "REPAIR TABLE proc" resolved the issue.

- Trey

=

Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email: treyd...@tamu.edu
Jabber: treyd...@tamu.edu



[slurm-dev] Unable to start slurmdbd after system crash

2014-10-21 Thread Trey Dockendorf
Yesterday the VMs running slurmctld, slurmdbd and MySQL all crashed due to
issues with storage networks.  I am now unable to start slurmdbd.  The
error I get is below.  The part about "is marked as crashed and should be
repaired", is that coming from MySQL or from SLURM?  Any advice on how to
proceed is appreciated.

Thanks,
- Trey

[2014-10-21T22:12:18.849] error: mysql_query failed: 145 Table
'./mysql/proc' is marked as crashed and should be repaired
drop procedure if exists get_parent_limits; create procedure
get_parent_limits(my_table text, acct text, cluster text, without_limits
int) begin set @par_id = NULL; set @mj = NULL; set @msj = NULL; set @mcpj =
NULL; set @mnpj = NULL; set @mwpj = NULL; set @mcmpj = NULL; set @mcrm =
NULL; set @def_qos_id = NULL; set @qos = ''; set @delta_qos = ''; set
@my_acct = acct; if without_limits then set @mj = 0; set @msj = 0; set
@mcpj = 0; set @mnpj = 0; set @mwpj = 0; set @mcmpj = 0; set @mcrm = 0; set
@def_qos_id = 0; set @qos = 0; set @delta_qos = 0; end if; REPEAT set @s =
'select '; if @par_id is NULL then set @s = CONCAT(@s, '@par_id :=
id_assoc, '); end if; if @mj is NULL then set @s = CONCAT(@s, '@mj :=
max_jobs, '); end if; if @msj is NULL then set @s = CONCAT(@s, '@msj :=
max_submit_jobs, '); end if; if @mcpj is NULL then set @s = CONCAT(@s,
'@mcpj := max_cpus_pj, ') ;end if; if @mnpj is NULL then set @s =
CONCAT(@s, '@mnpj := max_nodes_pj, ') ;end if; if @mwpj is NULL then set @s
= CONCAT(@s, '@mwpj := max_wall_pj, '); end if; if @mcmpj is NULL then set
@s = CONCAT(@s, '@mcmpj := max_cpu_mins_pj, '); end if; if @mcrm is NULL
then set @s = CONCAT(@s, '@mcrm := max_cpu_run_mins, '); end if; if
@def_qos_id is NULL then set @s = CONCAT(@s, '@def_qos_id := def_qos_id,
'); end if; if @qos = '' then set @s = CONCAT(@s, '@qos := qos, @delta_qos
:= REPLACE(CONCAT(delta_qos, @delta_qos), \',,\', \',\'), '); end if; set
@s = concat(@s, '@my_acct := parent_acct from "', cluster, '_', my_table,
'" where acct = \'', @my_acct, '\' && user=\'\''); prepare query from @s;
execute query; deallocate prepare query; UNTIL (@mj != -1 && @msj != -1 &&
@mcpj != -1 && @mnpj != -1 && @mwpj != -1 && @mcmpj != -1 && @mcrm != -1 &&
@def_qos_id != -1 && @qos != '') || @my_acct = '' END REPEAT; END;
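For context, the statement that fails here is SLURM's get_parent_limits stored procedure, which walks up the account (association) hierarchy and fills in each limit that is still unset from the nearest ancestor that defines it. Below is a minimal Python sketch of that inheritance loop, using made-up accounts and just two of the limits; this is an illustration of the idea, not SLURM's actual schema or code:

```python
# Mock association table: account -> parent account plus the limits this
# account sets itself (None = unset).  Purely hypothetical data; SLURM
# keeps the real rows in the per-cluster assoc table.
ASSOC = {
    "root":    {"parent": "",        "max_jobs": 100,  "max_wall_pj": 1440},
    "science": {"parent": "root",    "max_jobs": None, "max_wall_pj": 720},
    "chem":    {"parent": "science", "max_jobs": None, "max_wall_pj": None},
}

def get_parent_limits(acct):
    """Inherit each unset limit from the closest ancestor that defines it,
    walking upward until the root of the tree -- the REPEAT ... UNTIL loop
    in the stored procedure."""
    limits = {"max_jobs": None, "max_wall_pj": None}
    while acct:
        row = ASSOC[acct]
        for key in limits:
            if limits[key] is None and row[key] is not None:
                limits[key] = row[key]
        acct = row["parent"]
    return limits

# chem inherits max_wall_pj from science and max_jobs from root:
print(get_parent_limits("chem"))  # -> {'max_jobs': 100, 'max_wall_pj': 720}
```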
[2014-10-21T22:12:18.850] error: Couldn't load specified plugin name for
accounting_storage/mysql: Plugin init() callback failed
[2014-10-21T22:12:18.850] error: cannot create accounting_storage context
for accounting_storage/mysql
[2014-10-21T22:12:18.850] fatal: Unable to initialize
accounting_storage/mysql accounting storage plugin



=

Trey Dockendorf
Systems Analyst I
Texas A&M University
Academy for Advanced Telecommunications and Learning Technologies
Phone: (979)458-2396
Email: treyd...@tamu.edu
Jabber: treyd...@tamu.edu


[slurm-dev] Odd (ReqNodeNotAvail) and (PartitionNodeLimit) with multiple partitions

2014-10-21 Thread Mikael Johansson



Hello All,

I had a problem with jobs being stuck in the queue and not being scheduled 
even with unused cores on the cluster. The system has four partitions, 
three different "high priority" ones and one lower priority, "backfill" 
partition. A concise description of the setup in slurm.config, SLURM 
2.2.7:



PartitionName=backfill Nodes=node[001-026] Default=NO  MaxNodes=10 
MaxTime=168:00:00 AllowGroups=ALL Priority=1 DisableRootJobs=NO RootOnly=NO 
Hidden=NO Shared=NO PreemptMode=requeue
PartitionName=short    Nodes=node[005-026] Default=YES MaxNodes=6  
MaxTime=002:00:00 AllowGroups=ALL Priority=2 DisableRootJobs=NO RootOnly=NO 
Hidden=NO Shared=NO PreemptMode=off
PartitionName=medium   Nodes=node[009-026] Default=NO  MaxNodes=4  
MaxTime=168:00:00 AllowGroups=ALL Priority=2 DisableRootJobs=NO RootOnly=NO 
Hidden=NO Shared=NO PreemptMode=off
PartitionName=long Nodes=node[001-004] Default=NO  MaxNodes=4  
MaxTime=744:00:00 AllowGroups=ALL Priority=2 DisableRootJobs=NO RootOnly=NO 
Hidden=NO Shared=NO PreemptMode=off

SchedulerType=sched/builtin
PreemptType=preempt/partition_prio
PreemptMode=requeue


(I'll send more of course if needed)


The problem here is that the backfill jobs will only be scheduled to run 
on nodes node[001-008]; they will never start on nodes node[009-026]. I 
tested this by submitting a job explicitly to a specific node (node020) 
in two ways; both leave the job stuck in a different, odd state:


#SBATCH -w node020:
  the job gets status (ReqNodeNotAvail), and the log shows "debug2: sched:
  JobId=NN allocated resources: NodeList=(null)" and "debug3:
  JobId=NN required nodes not avail"

#SBATCH -x node[001-019,021-026]:
  the job gets status (PartitionNodeLimit), and the log shows "debug3:
  JobId=NN not runnable with present config"


I have no idea how SLURM arrives at these conclusions. While trying to 
find out what's going on, I found that the following _does_ start the 
jobs (but breaks the configuration, of course):


1. Increasing the priority of PartitionName=backfill to the same as the
   others, 2

2. Removing node020 from all other partitions


I also thought it might be somehow related to the fact that nodes 
node[009-026] are shared by three partitions (instead of just 2, like the 
other nodes), which perhaps confuses SLURM 2.2.7. Removing node020 from, 
for example, the short partition, leaving it in only medium and backfill 
does not help, though.


However, removing node020 from the medium partition, leaving it only in 
the short and backfill partitions, does work, and the job starts in 
backfill without problems.


To me this sounds like an odd bug, but perhaps I'm missing something. If 
it is a bug, and known to be fixed in later versions, that would be a good 
reason to upgrade SLURM to something a bit more modern. At the same time, 
if someone comes up with a work-around, that would be a much easier 
solution to implement, at least in the short term.
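For what it's worth, SchedulerType=sched/builtin may explain part of this: the builtin scheduler attempts jobs strictly in priority order and, roughly speaking, stops at the first pending job it cannot place, so a priority-1 backfill job can sit behind a blocked priority-2 job even when the backfill job's own nodes are free. A simplified Python sketch of that top-down behaviour (my own simplification for illustration, not SLURM's actual algorithm):

```python
def builtin_schedule(pending, free_nodes):
    """Strict top-down scheduling: consider jobs in descending priority
    order and stop at the first job that cannot start (no backfilling
    around it, unlike sched/backfill)."""
    started = []
    free = set(free_nodes)
    for job in sorted(pending, key=lambda j: -j["priority"]):
        usable = [n for n in job["nodes"] if n in free]
        if len(usable) < job["n_nodes"]:
            break  # builtin gives up here; everything behind this job waits
        alloc = usable[:job["n_nodes"]]
        free -= set(alloc)
        started.append(job["name"])
    return started

# A higher-priority job waiting on busy node001 blocks the backfill job,
# even though the backfill job's node020 is free:
pending = [
    {"name": "short_job",    "priority": 2, "n_nodes": 1, "nodes": ["node001"]},
    {"name": "backfill_job", "priority": 1, "n_nodes": 1, "nodes": ["node020"]},
]
print(builtin_schedule(pending, free_nodes=["node020"]))  # -> []
```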



So again, all ideas and suggestions, or just explanations of the odd job 
states are most welcome!



Cheers,
Mikael J.
http://www.iki.fi/~mpjohans/


[slurm-dev] Re: Partition for unused resources until needed by any other partition

2014-10-21 Thread Mikael Johansson



Thanks!

That looks like something that could be useful indeed. For the moment, 
though, we are stuck with version 2.2.7, and if I understood the docs 
correctly, most of the partition-based parameters were introduced in 
later versions. We might upgrade at some point in the future.


It also seems like the scheduling problems are not directly related to the 
setup of the partitions, but rather to some odd behaviour on the part of 
SLURM. I'll write a separate post about that, as it seems it might be a 
bug.


Cheers, and thanks again,
Mikael J.
http://www.iki.fi/~mpjohans/


On Mon, 20 Oct 2014, Paul Edmon wrote:

I advise setting the SchedulerParameters options partition_job_depth and 
bf_max_job_part.  These force the scheduler to consider jobs from each 
partition; otherwise it takes a strictly top-down approach.



This is what we run:

#  default_queue_depth should be some multiple of the partition_job_depth,
#  ideally number_of_partitions * partition_job_depth.
SchedulerParameters=default_queue_depth=5700,partition_job_depth=100,bf_interval=1,bf_continue,bf_window=2880,bf_resolution=3600,bf_max_job_test=5,bf_max_job_part=5,bf_max_job_user=1,bf_max_job_start=100,max_rpc_cnt=8

These parameters work well for a cluster of 50,000 cores, 57 queues, and 
about 40,000 jobs per day.  We are running 14.03.8
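As a quick check, the rule of thumb in the comment matches the numbers quoted: 57 queues at a partition_job_depth of 100 gives exactly the configured default_queue_depth of 5700.

```python
# Rule of thumb from the comment above:
#   default_queue_depth ~= number_of_partitions * partition_job_depth
n_partitions = 57          # "57 queues"
partition_job_depth = 100  # from SchedulerParameters
depth = n_partitions * partition_job_depth
print(depth)  # -> 5700, the configured default_queue_depth
```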


-Paul Edmon-


[slurm-dev] Re: Command equivalent in SLURM to LSF

2014-10-21 Thread lfelipe
Dear all:

Thanks to all for your comments and advice.

Luis Felipe Ruiz Nieto


On 20/10/14 at #4, Morris Jette wrote:
> Try srun --noalloc as root
>
> On October 20, 2014 4:43:49 AM PDT, lfelipe  wrote:
>
> Thanks for your reply.
>
> The problem is that if the machine is out of resources, the job
> does not run. With the brun command, the job starts running even
> though the machine's resources are full.
>
>
> On 20/10/14 at #4, Sefa Arslan wrote:
>> sorry, I have forgotten to add;
>>
>> after increased the priority, update the job nodelist using
>> scontrol..
>>
>> scontrol update jobid=X nodelist=nodeXXX
>>
>>
>>
>> On 20/10/14 13:30, lfelipe wrote:
>>> Dear all:
>>>
>>> Does anyone know the equivalent command in SLURM to LSF's "brun -f"
>>> (which forces a pending job to run immediately on specified hosts)?
>>>
>>> Thank you.
>>>
>>> Luis
>>
>
>
> -- 
> Sent from my Android phone with K-9 Mail. Please excuse my brevity.