Probably not specific to 20.11.1, nor a Cray, but has anyone out there seen anything like this?

As the slurmctld restarts, after upping the debug level, it all looks hunky dory,

[2020-12-17T09:23:46.204] debug3: Trying to load plugin /opt/slurm/20.11.1/lib64/slurm/job_submit_cray_aries.so
[2020-12-17T09:23:46.205] debug3: Success.
[2020-12-17T09:23:46.206] debug3: Trying to load plugin /opt/slurm/20.11.1/lib64/slurm/job_submit_lua.so
[2020-12-17T09:23:46.207] debug3: slurm_lua_loadscript: job_submit/lua: loading Lua script: /etc/opt/slurm/job_submit.lua
[2020-12-17T09:23:46.208] debug3: Success.
[2020-12-17T09:23:46.209] debug3: Trying to load plugin /opt/slurm/20.11.1/lib64/slurm/prep_script.so
[2020-12-17T09:23:46.210] debug3: Success.

but, at the point a submitted job should pass through the job_submit script,

[2020-12-17T09:26:06.806] debug3: job_submit/lua: slurm_lua_loadscript: skipping loading Lua script: /etc/opt/slurm/job_submit.lua
[2020-12-17T09:26:06.807] debug3: assoc_mgr_fill_in_user: found correct user: someuser(12345)
[2020-12-17T09:26:06.808] debug5: assoc_mgr_fill_in_assoc: looking for assoc of user=someuser(12345), acct=accnts0001, cluster=clust, partition=acceptance
[2020-12-17T09:26:06.809] debug3: assoc_mgr_fill_in_assoc: found correct association of user=someuser(12345), acct=accnts0001, cluster=clust, partition=acceptance to assoc=67 acct=accnts0001

Reason I went looking is that the job_submit.lua should be telling me, the job submitter, to "sling my hook" as I have, deliberately, left something out.

FWIW, the debug level here goes all the way to 5, so I was hoping for a little more info as to why it is skipping it.

The skip is occurring, in src/lua/slurm_lua.c, because of this trap

    if (st.st_mtime <= *load_time) {
            debug3("%s: %s: skipping loading Lua script: %s",
                   plugin, __func__, script_path);
            return SLURM_SUCCESS;
    }
    debug3("%s: %s: loading Lua script: %s",
           __func__, plugin, script_path);

where "st" is a stat struct, but I am currently none the wiser as to why such a condition would be (maybe even, would need to be) triggered?
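
For anyone following along, here is one possible reading of that test, as a minimal sketch and not the actual Slurm code. It treats the check as a reload guard: if the script's modification time is no later than the time recorded at the previous load, the file is taken to be unchanged and the load is skipped. The helper name script_needs_load is hypothetical, and the step that stores the mtime as the new load time is an assumption on my part, not something shown in the snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper, not part of Slurm. Decides whether a script whose
 * modification time is `mtime` needs (re)loading, given the time recorded
 * at the previous load. Mirrors the `st.st_mtime <= *load_time` test.
 */
static bool script_needs_load(time_t mtime, time_t *load_time)
{
	assert(load_time != NULL);

	/* File not modified since the recorded load time: skip the load. */
	if (mtime <= *load_time)
		return false;

	/* Assumption: remember the mtime of the version being loaded now. */
	*load_time = mtime;
	return true;
}
```

Read this way, a "skipping loading" message on job submission would only say that the file has not changed since it was last loaded, not that the script is being bypassed. Whether that matches what slurmctld really does after the skip is something I have not confirmed.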

The job submit script is certainly "younger" than the time of the slurmctld restart, and of the job submission, but then, why wouldn't it be?

Kevin
--
Supercomputing Systems Administrator
Pawsey Supercomputing Centre