[slurm-dev] Re: Job Submit Lua Plugin

2017-06-28 Thread Nathan Vance
Nicholas,

I can confirm that I get the same result as you, and I now realize my
mistake. I executed
sbatch ls_test -N 2
rather than
sbatch -N 2 ls_test

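Options placed after the script name are passed to the script as arguments
instead of being parsed by sbatch, so the node count was never set. A rather
silly thing to get tripped up over.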

Thanks,
Nathan


[slurm-dev] Re: Job Submit Lua Plugin

2017-06-28 Thread Nicholas McCollum
Try this:

[root@dmc197 ~]# sbatch ls_test 
sbatch: error: 
Job requested Min Nodes: 4294967294.

sbatch: error: Batch job submission failed: Unspecified error
[root@dmc197 ~]# sbatch -N 2 ls_test 
sbatch: error: 
Job requested Min Nodes: 2.

sbatch: error: Batch job submission failed: Unspecified error
[root@dmc197 ~]# cat /etc/slurm/job_submit.lua
function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
  return slurm.SUCCESS
end

function slurm_job_submit(job_desc, part_list, submit_uid)
  -- Test only: echo the requested minimum node count, then reject the job.
  local test_min_nodes = job_desc.min_nodes
  local error_verbose = string.format("Job requested Min Nodes: %s.\n", test_min_nodes)
  slurm.log_user("\n%s", error_verbose)
  return slurm.ERROR
end


-- 
Nicholas McCollum
HPC Systems Administrator
Alabama Supercomputer Authority




[slurm-dev] Re: Job Submit Lua Plugin

2017-06-28 Thread Nathan Vance
Correction (copy/pasted wrong thing): It was the "JobSubmitPlugins=lua"
line in slurm.conf, not "job_submit.lua: initialized", that did the trick.

At least, I thought that was the end of the story. Now I'm getting odd
errors when reading job_desc and part_list that behave, in my estimation,
as if Lua were receiving a bad pointer to the underlying C data structure.

On Ubuntu, the unedited job_submit.lua provided with the sample code runs
without crashing, though it does not respect the --partition="foo" flag in
sbatch as the source code suggests it should. When edited to include
slurm.log_info("bar"), the script crashes with:
/etc/slurm/job_submit.lua:38: attempt to compare number with nil
The fact that behaviour changes based on the presence of unrelated code
makes me think that this is a pointer issue, but I don't know enough about
how Lua compiles to bytecode to diagnose it.
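
"attempt to compare number with nil" is the error Lua raises when one side of a < or > comparison is nil, for instance when a script reads a field that the object it was handed does not provide. The following is a defensive sketch rather than a diagnosis of the sample script. It assumes part_list can be walked with pairs() and that each entry exposes its fields by name; the helper name pick_highest and the "priority" field used in the usage line are illustrative assumptions, not confirmed Slurm names.

```lua
-- Hypothetical helper: return the key of the entry whose `field` holds the
-- largest number, skipping entries where that field is nil or not a number.
function pick_highest(part_list, field)
  local best_name, best_value = nil, nil
  for name, part in pairs(part_list) do
    local value = part[field]
    -- Guard first: only compare when the field really is a number.
    if type(value) == "number" and (best_value == nil or value > best_value) then
      best_name, best_value = name, value
    end
  end
  return best_name
end
```

Calling pick_highest(part_list, "priority") then returns nil when no entry carries a usable value, so the caller can log that case instead of crashing on the comparison.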

On CentOS, with or without the log command, it crashes at the same point as
on Ubuntu.

On both:
When I comment out the example code so that it doesn't crash, then try to
print out values in job_desc, I get some really odd results. For example,
job_desc.min_nodes is 4294967294 (on both systems), regardless of what I
set with sbatch job.sh --nodes=X. At first I thought that Slurm gave my Lua
script a bad pointer to something that had already been garbage collected,
but then I discovered that if I hard-code something in Lua such as
job_desc.min_nodes=X, then Slurm assigns X nodes to the job. So perhaps
Slurm respects what Lua populates job_desc with, but Slurm initially fills
it with arbitrary values?
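
For what it's worth, 4294967294 is 0xfffffffe, which slurm.h defines as NO_VAL, the marker for a field that was never set, so the number is a sentinel rather than garbage. A minimal sketch of treating it that way in a script follows. The constant is spelled out locally instead of relying on the plugin to export it, and the helper name describe_min_nodes is made up for illustration.

```lua
-- Assumption: 4294967294 (0xfffffffe) is Slurm's NO_VAL marker for "not set".
local NO_VAL = 4294967294

-- Hypothetical helper: describe min_nodes, treating nil and NO_VAL as unset.
function describe_min_nodes(job_desc)
  local n = job_desc.min_nodes
  if n == nil or n == NO_VAL then
    return "min_nodes not specified"
  end
  return string.format("min_nodes = %d", n)
end
```

Inside slurm_job_submit this could be logged with slurm.log_info("%s", describe_min_nodes(job_desc)), which makes an unset value easy to tell apart from a real request.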

Here's the lua script I used for the above experiments:
== BEGIN job_submit.lua ==
function slurm_job_submit(job_desc, part_list, submit_uid)
  -- Log the value Slurm passed in, then override it.
  slurm.log_info("min_nodes = %s", tostring(job_desc.min_nodes))
  job_desc.min_nodes = 5
  return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
  return slurm.SUCCESS
end

slurm.log_info("initialized")
return slurm.SUCCESS
== END job_submit.lua ==

As an aside, it looks like job_desc uses job_descriptor under the hood:
https://github.com/SchedMD/slurm/blob/master/slurm/slurm.h.in#L1373-L1553
As I wasn't positive, I experimented first using job_desc.qos, which
Nicholas indicated should be supported, but while it exhibited similar
behaviour to min_nodes, it didn't fail quite as spectacularly.
I couldn't figure out what structure backs part_list. The documentation at
https://slurm.schedmd.com/job_submit_plugins.html isn't clear when all it
says is that it's a "List of pointer to partitions which this user is
authorized to use." [sic]

I'm still using Slurm 17.02.5. On Ubuntu I'm using lua5.2, and on CentOS
it's lua5.1. In both cases, Lua (both the interpreter and the dev
libraries) was installed from the repositories, and Slurm was built from
source.

It seems like I filled an email with a whole lot of complaints and no real
questions. So, is this a configuration error on my end? Should I suck it up
and write my plugin in C, even though I don't need full access to
slurmctld? Should I switch to using slurm-wlm? Should I open a bug report?

Thanks,
Nathan


[slurm-dev] Re: Job Submit Lua Plugin

2017-06-27 Thread Nicholas McCollum
Nathan,

I have very much appreciated the job_submit.lua plugin for helping
educate users on what is an acceptable job.  It is one of my favorite
features about SLURM and has been invaluable in assisting students in
submitting valid job requirements.  

If a user specifies some absurd amount of memory, or some other sbatch
or srun parameter... or does not choose a parameter, I like to notify
the user what they have done wrong.  For example I require all users to
specify a QoS when they submit a job.  

== BEGIN EXAMPLE job_submit.lua ==

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
  return slurm.SUCCESS
end

function slurm_job_submit(job_desc, part_list, submit_uid)

  --[[ Start with an error count of 0 ]]--
  local asc_error = 0
  local asc_error_verbose = ""

  --[[ Pretend if statement ]]--
    asc_error = asc_error + 1
    asc_error_verbose = string.format("%s\nERROR: Job requested something we don't like.\n", asc_error_verbose)
  --[[ End Pretend if statement ]]--

  --[[ Pretend if statement ]]--
    asc_error = asc_error + 1
    asc_error_verbose = string.format("%s\nERROR: More bad stuff.\n", asc_error_verbose)
  --[[ End Pretend if statement ]]--

  if asc_error > 0 then
    slurm.log_user("\n%s", asc_error_verbose)
    return slurm.ERROR
  end

  --[[ Want to return slurm.SUCCESS if the entire script runs to end ]]--
  return slurm.SUCCESS
end

== END EXAMPLE job_submit.lua ==

This is the method that I worked out: it collects all of the errors in
asc_error_verbose and reports them at the end, together with return
slurm.ERROR. If you use the file above as is, it will reject every job
with those errors. This would be a great way to check that
job_submit.lua is working on your system. If you have any current jobs
though, it will kill them all... so use this on a development
environment for testing.

My example for making a user specify a QoS:

  local asc_qos = job_desc.qos
  if asc_qos == nil then
    asc_error = asc_error + 1
    asc_error_verbose = string.format("%s\nJob must request a QoS using the --qos= flag.\n", asc_error_verbose)
    asc_qos = "invalid"
  end
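
Putting the two pieces above together, here is one possible way to structure the collect-then-report pattern with a real check instead of the pretend if statements. It is a sketch under assumptions, not my production file: the checks list and the collect_errors helper are names invented for illustration, and the small stand-in slurm table exists only so the file can also be run with a plain lua interpreter (inside slurmctld the real table is used).

```lua
-- Stand-in for the table slurmctld provides, so the sketch also runs under
-- a plain lua interpreter. Inside slurmctld the real table is kept.
slurm = slurm or {
  SUCCESS = 0,
  ERROR = -1,
  log_user = function(fmt, ...) print(string.format(fmt, ...)) end,
}

-- Hypothetical policy list. Each check returns nil when the job is
-- acceptable, or a message describing what is wrong.
local checks = {
  function(job_desc)
    if job_desc.qos == nil then
      return "Job must request a QoS using the --qos= flag."
    end
  end,
}

-- Hypothetical helper: run every check and collect the messages.
function collect_errors(job_desc)
  local errors = {}
  for _, check in ipairs(checks) do
    local message = check(job_desc)
    if message then
      errors[#errors + 1] = "ERROR: " .. message
    end
  end
  return errors
end

function slurm_job_submit(job_desc, part_list, submit_uid)
  local errors = collect_errors(job_desc)
  if #errors > 0 then
    -- Report every problem at once, then reject the job.
    slurm.log_user("\n%s\n", table.concat(errors, "\n"))
    return slurm.ERROR
  end
  return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
  return slurm.SUCCESS
end
```

Keeping the checks in a list means a new rule is one more function in the table, and the user still sees every complaint in a single rejection message.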


I'd be more than happy to share my job_submit.lua if anyone is
interested.  I only ask that you share yours back.

-- 
Nicholas McCollum
HPC Systems Administrator
Alabama Supercomputer Authority


[slurm-dev] Re: Job Submit Lua Plugin

2017-06-27 Thread Ryan Cox

Nathan and Darby,

For you and anyone else using Lua, see
https://bugs.schedmd.com/show_bug.cgi?id=3815 with regard to --mem vs
--mem-per-cpu starting in 17.02.


Ryan

--
Ryan Cox
Operations Director
Fulton Supercomputing Lab
Brigham Young University



[slurm-dev] Re: Job Submit Lua Plugin

2017-06-27 Thread Nathan Vance
Darby,

The "job_submit.lua: initialized" line in slurm.conf was indeed the issue.
When compiling slurm I only got the "yes lua" line without the flags, but
that seems to be just a difference in OS's.

Now that I have debugging feedback I should be good to go!

Thanks,
Nathan


[slurm-dev] Re: Job Submit Lua Plugin

2017-06-27 Thread Vicker, Darby (JSC-EG311)
We recently started using a Lua job submit plugin as well. You have to have
the lua-devel package installed when you compile Slurm. It looks like you do
(we use RHEL, where the package name is lua-devel), but confirm that you see
something like this in config.log:

configure:24784: result: yes lua
pkg_cv_lua_LIBS='-llua -lm -ldl  '
lua_CFLAGS='  -DLUA_COMPAT_ALL'
lua_LIBS='-llua -lm -ldl  '

Do you have this in your slurm.conf?

JobSubmitPlugins=lua

I'm guessing not given you don't see anything in the logs. Before I got all the 
errors worked out, I would see errors like this in slurmctld_log:

error: Couldn't find the specified plugin name for job_submit/lua looking at 
all files
error: cannot find job_submit plugin for job_submit/lua
error: cannot create job_submit context for job_submit/lua
failed to initialize job_submit plugin


After getting everything working, you should see this:

job_submit.lua: initialized

As well as any other slurm.log_info messages you put in your lua script.


From: Nathan Vance 
Reply-To: slurm-dev 
Date: Tuesday, June 27, 2017 at 12:15 PM
To: slurm-dev 
Subject: [slurm-dev] Job Submit Lua Plugin

Hello all!

I've been working on getting off the ground with Lua plugins. The goal is to
implement Torque's routing queues for SLURM, but so far I have been unable to
get SLURM to even call my plugin.

What I have tried:
1) Copied contrib/lua/job_submit.lua to /etc/slurm/ (the same directory as
slurm.conf)
2) Restarted slurmctld and verified that no functionality was broken
3) Added slurm.log_info("I got here") to several points in the script. After
restarting slurmctld and submitting a job, grep "I got here" -R /var/log found
no results.
4) In case there was a problem with the log file, I added os.execute("touch
/home/myUser/slurm_job_submitted") to the top of the slurm_job_submit method.
Restarting slurmctld and submitting a job still produced no evidence that my
plugin was called.
5) In case there were permission issues, I made job_submit.lua executable.
Nothing. Even grep "job_submit" -R /var/log (in case there was an error calling
the script) comes up dry.

Relevant information:
OS: Ubuntu 16.04
Lua: lua5.2 and liblua5.2-dev (I can use Lua interactively)
SLURM version: 17.02.5, compiled from source (after installing Lua) using
./configure --prefix=/usr --sysconfdir=/etc/slurm

Any guidance to get me up and running would be greatly appreciated!

Thanks,
Nathan