Hi Trey,

I was looking for an explanation because a user reported an error message during job submission, and I saw the "not the right user" message in slurmctld.log associated with that job submission. After analysis, I found that the two problems are not related, but I wanted to be clear about the "not the right user" message.
We have just migrated from Slurm 2.6 to 14.11.9, and I never saw this message with 2.6. Since we had just had an LDAP problem on our cluster, we were watching closely for any error that might concern LDAP. So all is clear now and I don't have to worry about this message.

Thanks a lot for your help.

Best regards,
Gerard

Departement Calcul Intensif
Centre Informatique National de l'Enseignement Superieur
950, rue de Saint Priest
34097 Montpellier CEDEX 5
FRANCE
tel : (334) 67 14 14 14
fax : (334) 67 52 37 63
web : http://www.cines.fr

> From: "Trey Dockendorf" <treyd...@tamu.edu>
> To: "slurm-dev" <slurm-dev@schedmd.com>
> Sent: Wednesday, 21 October 2015 01:12:42
> Subject: [slurm-dev] Re: Problem using slurm 14.11.9 :
>
> Is it causing any issues? As far as I can tell from the source, that debug
> statement is just emitted when SLURM loops through associations. "not the
> right user" is printed to debug before SLURM moves on to the next record.
>
> https://github.com/SchedMD/slurm/blob/slurm-15.08/src/common/assoc_mgr.c#L226-L229
>
> - Trey
>
> =============================
> Trey Dockendorf
> Systems Analyst I
> Texas A&M University
> Academy for Advanced Telecommunications and Learning Technologies
> Phone: (979)458-2396
> Email: treyd...@tamu.edu
> Jabber: treyd...@tamu.edu
>
> On Tue, Oct 20, 2015 at 12:34 PM, Anatoliy Kovalenko <tolik.kovale...@gmail.com> wrote:
>> Hi guys. We have the same issue with slurm 15.08.1, and we use LDAP for all
>> nodes too.
>>
>> 2015-10-20 17:00 GMT+03:00 <g...@cines.fr>:
>>> Hi Trey,
>>>
>>> That was my first idea, and that is where I was looking for the issue, but
>>> without any result.
>>> All systems have the same information, as all nodes are LDAP clients.
>>>
>>> Thanks for your answer.
>>> Best regards,
>>> Gerard
>>>
>>>> From: "Trey Dockendorf" <treyd...@tamu.edu>
>>>> To: "slurm-dev" <slurm-dev@schedmd.com>
>>>> Cc: "Gérard Gil" <gerard....@cines.fr>
>>>> Sent: Monday, 19 October 2015 18:34:34
>>>> Subject: Re: [slurm-dev] Problem using slurm 14.11.9 :
>>>>
>>>> I've seen similar messages in slurmd logs when the primary GID of a user at
>>>> submit time did not match their primary GID on the compute node, due to a
>>>> login session of the user existing during a change in their GID. The error
>>>> likely results from UIDs not being consistent on all systems.
>>>>
>>>> - Trey
>>>>
>>>> On Mon, Oct 19, 2015 at 11:21 AM, <g...@cines.fr> wrote:
>>>>> Hello,
>>>>>
>>>>> Can someone explain this kind of message in slurmctld.log:
>>>>>
>>>>> debug: not the right user 2279 != 1761
>>>>>
>>>>> Thanks,
>>>>> Best regards,
>>>>> Gerard Gil
>>>>>
>>>>>> From: "gil" <g...@cines.fr>
>>>>>> To: "slurm-dev" <slurm-dev@schedmd.com>
>>>>>> Cc: "gil" <g...@cines.fr>
>>>>>> Sent: Wednesday, 7 October 2015 09:42:31
>>>>>> Subject: Problem using --ntasks (slurm 14.11.9)
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We have just upgraded our configuration from SLURM 2.6.9 to SLURM 14.11.9.
>>>>>> We are facing a new issue with jobs using --ntasks.
>>>>>> The variables SLURM_NTASKS, SLURM_NPROCS and SLURM_STEP_NUM_TASKS are
>>>>>> set to wrong values when the job is submitted using the sbatch command.
>>>>>>
>>>>>> Slurm script example:
>>>>>>
>>>>>> #!/bin/bash
>>>>>> #SBATCH --nodes=4
>>>>>> #SBATCH --ntasks=7
>>>>>> #SBATCH --ntasks-per-node=2
>>>>>> ...
>>>>>>
>>>>>> In the "normal" case, with slurm 2.6.9, we get:
>>>>>>
>>>>>> SLURM_NTASKS=7
>>>>>> SLURM_NPROCS=7
>>>>>> SLURM_STEP_NUM_TASKS=7
>>>>>>
>>>>>> With slurm version 14.11.9, when the job is submitted with the sbatch
>>>>>> command, we get:
>>>>>>
>>>>>> SLURM_NTASKS=8
>>>>>> SLURM_NPROCS=8
>>>>>> SLURM_STEP_NUM_TASKS=8
>>>>>>
>>>>>> With slurm version 14.11.9, when the job is submitted with the salloc
>>>>>> command, we get:
>>>>>>
>>>>>> SLURM_NTASKS=7
>>>>>> SLURM_NPROCS=7
>>>>>> SLURM_STEP_NUM_TASKS=7
>>>>>>
>>>>>> The only workaround we have found is to set these three variables
>>>>>> "by hand" inside the slurm script, as the first command before the job steps:
>>>>>>
>>>>>> #!/bin/bash
>>>>>> #SBATCH --nodes=4
>>>>>> #SBATCH --ntasks=7
>>>>>> #SBATCH --ntasks-per-node=2
>>>>>> ...
>>>>>> SLURM_NTASKS=7
>>>>>> SLURM_NPROCS=7
>>>>>> SLURM_STEP_NUM_TASKS=7
>>>>>>
>>>>>> Any idea about this problem?
>>>>>> How can we solve it?
>>>>>>
>>>>>> Best regards,
>>>>>> Gerard Gil
>> --
>> ¯\_(ツ)_/¯
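
For anyone trying to reproduce the --ntasks report at the bottom of this thread, here is a minimal bash sketch that checks the three variables against the requested task count from inside a batch script. The function name check_ntasks and its output format are hypothetical and not part of Slurm; it only reads the environment variables named in the report. Note that the value 8 seen under sbatch equals --nodes times --ntasks-per-node (4 x 2), but the thread does not establish whether that is the cause.

```shell
#!/bin/bash
# Hypothetical helper (not a Slurm command): compare the task-count
# variables mentioned in the report against the value given to --ntasks.
# Usage: check_ntasks EXPECTED_COUNT
check_ntasks() {
    local expected="$1"
    local var actual status=0
    for var in SLURM_NTASKS SLURM_NPROCS SLURM_STEP_NUM_TASKS; do
        # Indirect expansion: read the variable whose name is in $var,
        # falling back to the literal string "unset" if it is not defined.
        actual="${!var:-unset}"
        if [[ "$actual" != "$expected" ]]; then
            echo "mismatch: $var=$actual (expected $expected)"
            status=1
        fi
    done
    return "$status"
}
```

Calling `check_ntasks 7` as the first command after the #SBATCH lines would print one line per variable that differs from 7 and return a non-zero status, which makes it easy to compare sbatch and salloc submissions of the same script.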