Hi Mahmood,

Your running job is requesting 6 CPUs per node (4 nodes, 6 CPUs per node). That means 6 CPUs are in use on node hpc.
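The core of the problem is simple arithmetic over the numbers quoted below (6 CPUs per node for the running job, 5 per node for the pending one, 10 cores on hpc). A self-contained sketch of the check, using those figures as hard-coded illustrative values rather than live Slurm queries:

```shell
# Illustrative values copied from the job records in this thread,
# not queried from Slurm:
hpc_cores=10        # node hpc: Sockets=10, one allocatable CPU each
running_on_hpc=6    # job 124: 6 CPUs per node (CPU_IDs=0-5), hpc included
pending_on_hpc=5    # job 125: --ntasks-per-node=5

total=$((running_on_hpc + pending_on_hpc))
if [ "$total" -gt "$hpc_cores" ]; then
    echo "hpc would need $total CPUs but has only $hpc_cores -> Reason=Resources"
fi
```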
Your queued job is requesting 5 CPUs per node (4 nodes, 5 CPUs per node). If it were running, that would require 6 + 5 = 11 CPUs on node hpc in total. But hpc only has 10 cores, so it can't run.

Sean

On Tue, 17 Dec 2019 at 20:03, Mahmood Naderan <mahmood...@gmail.com> wrote:

Please see the latest update:

# for i in {0..2}; do scontrol show node compute-0-$i | grep RealMemory; done && scontrol show node hpc | grep RealMemory
   RealMemory=64259 AllocMem=1024 FreeMem=57163 Sockets=32 Boards=1
   RealMemory=120705 AllocMem=1024 FreeMem=97287 Sockets=32 Boards=1
   RealMemory=64259 AllocMem=1024 FreeMem=40045 Sockets=32 Boards=1
   RealMemory=64259 AllocMem=1024 FreeMem=24154 Sockets=10 Boards=1

$ sbatch slurm_qe.sh
Submitted batch job 125

$ squeue
   JOBID PARTITION   NAME     USER ST  TIME NODES NODELIST(REASON)
     125       SEA  qe-fb  mahmood PD  0:00     4 (Resources)
     124       SEA U1phi1   abspou  R  3:52     4 compute-0-[0-2],hpc

$ scontrol show -d job 125
JobId=125 JobName=qe-fb
   UserId=mahmood(1000) GroupId=mahmood(1000) MCS_label=N/A
   Priority=1751 Nice=0 Account=fish QOS=normal WCKey=*default
   JobState=PENDING Reason=Resources Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   DerivedExitCode=0:0
   RunTime=00:00:00 TimeLimit=30-00:00:00 TimeMin=N/A
   SubmitTime=2019-12-17T12:29:08 EligibleTime=2019-12-17T12:29:08
   AccrueTime=2019-12-17T12:29:08
   StartTime=Unknown EndTime=Unknown Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2019-12-17T12:29:09
   Partition=SEA AllocNode:Sid=hpc.scu.ac.ir:22742
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=(null)
   NumNodes=4-4 NumCPUs=20 NumTasks=20 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=20,mem=40G,node=4,billing=20
   Socks/Node=* NtasksPerN:B:S:C=5:0:*:* CoreSpec=*
   MinCPUsNode=5 MinMemoryNode=10G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/mahmood/qe/f_borophene/slurm_qe.sh
   WorkDir=/home/mahmood/qe/f_borophene
   StdErr=/home/mahmood/qe/f_borophene/my_fb.log
   StdIn=/dev/null
   StdOut=/home/mahmood/qe/f_borophene/my_fb.log
   Power=

$ cat slurm_qe.sh
#!/bin/bash
#SBATCH --job-name=qe-fb
#SBATCH --output=my_fb.log
#SBATCH --partition=SEA
#SBATCH --account=fish
#SBATCH --mem=10GB
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=5
mpirun -np $SLURM_NTASKS /share/apps/q-e-qe-6.5/bin/pw.x -in f_borophene_scf.in

You can also see the detail of job 124:

$ scontrol show -d job 124
JobId=124 JobName=U1phi1
   UserId=abspou(1002) GroupId=abspou(1002) MCS_label=N/A
   Priority=958 Nice=0 Account=fish QOS=normal WCKey=*default
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:0
   DerivedExitCode=0:0
   RunTime=00:06:17 TimeLimit=30-00:00:00 TimeMin=N/A
   SubmitTime=2019-12-17T12:25:17 EligibleTime=2019-12-17T12:25:17
   AccrueTime=2019-12-17T12:25:17
   StartTime=2019-12-17T12:25:17 EndTime=2020-01-16T12:25:17 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2019-12-17T12:25:17
   Partition=SEA AllocNode:Sid=hpc.scu.ac.ir:20085
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=compute-0-[0-2],hpc
   BatchHost=compute-0-0
   NumNodes=4 NumCPUs=24 NumTasks=24 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=24,mem=4G,node=4,billing=24
   Socks/Node=* NtasksPerN:B:S:C=6:0:*:* CoreSpec=*
   Nodes=compute-0-[0-2],hpc CPU_IDs=0-5 Mem=1024 GRES=
   MinCPUsNode=6 MinMemoryNode=1G MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/home/abspou/OpenFOAM/abbaspour-6/run/laminarSMOKEPhi1U1/slurm_script.sh
   WorkDir=/home/abspou/OpenFOAM/abbaspour-6/run/laminarSMOKEPhi1U1
   StdErr=/home/abspou/OpenFOAM/abbaspour-6/run/laminarSMOKEPhi1U1/alpha3.45U1phi1lamSmoke.log
   StdIn=/dev/null
   StdOut=/home/abspou/OpenFOAM/abbaspour-6/run/laminarSMOKEPhi1U1/alpha3.45U1phi1lamSmoke.log
   Power=

I cannot figure out the root of the problem.
Regards,
Mahmood

On Tue, Dec 17, 2019 at 11:18 AM Marcus Wagner <wag...@itc.rwth-aachen.de> wrote:

Dear Mahmood,

could you please show the output of

scontrol show -d job 119

Best
Marcus
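Given the diagnosis above (6 CPUs of hpc already taken by job 124, only 4 left), one possible way to make job 125 schedulable is to lower its per-node task count. This is a sketch of a change to slurm_qe.sh, not a tested recommendation, and it assumes 16 MPI ranks are acceptable for this pw.x run:

```shell
# Sketch: request 4 tasks per node instead of 5, so hpc would need
# 6 (job 124) + 4 = 10 CPUs, which fits its 10 cores.
# $SLURM_NTASKS then becomes 16 instead of 20, and the existing
# "mpirun -np $SLURM_NTASKS" line picks that up automatically.
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=4
```

Alternatively, hpc could be kept out of the allocation with sbatch's `--exclude=hpc`, but with only three compute-0-* nodes in this cluster that would also require dropping `--nodes` to 3.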