Hi Ian,

Check the directory from which you issued the sbatch command, and I believe you'll find a file called slurm-74.out, which will contain both stdout and stderr from your job (assuming that directory is available on all the nodes in your allocation).
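For example, once the job has finished, something like this from the submission directory should show it (this assumes you kept the default slurm-%j.out naming; 74 is the job id printed by "Submitted batch job 74"):

    miao@SLURM0:~$ ls slurm-74.out
    miao@SLURM0:~$ cat slurm-74.out    # stdout and stderr of the batch job land here by default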
Regards,
Lyn

On Sat, Jan 4, 2014 at 5:10 PM, ian <ian.h.0...@gmail.com> wrote:
> Hello, I have a problem with sbatch. I run the commands below. Why can't
> I see the results from the control node and the compute nodes on standard
> output? Thank you.
>
> miao@SLURM0:~$ cat my.script
> #!/bin/sh
> #SBATCH --time=30
> /bin/hostname
> srun -l /bin/hostname
> srun -l /bin/pwd
> miao@SLURM0:~$ sbatch -n2 my.script
> Submitted batch job 74
> miao@SLURM0:~$
>
> miao@SLURM0:~$ scontrol show config
> Configuration data as of 2014-01-04T18:47:22
> AccountingStorageBackupHost = (null)
> AccountingStorageEnforce = none
> AccountingStorageHost = localhost
> AccountingStorageLoc = /var/log/slurm_jobacct.log
> AccountingStoragePort = 0
> AccountingStorageType = accounting_storage/none
> AccountingStorageUser = root
> AccountingStoreJobComment = YES
> AuthType = auth/munge
> BackupAddr = (null)
> BackupController = (null)
> BatchStartTimeout = 10 sec
> BOOT_TIME = 2014-01-04T16:53:52
> CacheGroups = 0
> CheckpointType = checkpoint/none
> ClusterName = slurm
> CompleteWait = 0 sec
> ControlAddr = SLURM0
> ControlMachine = SLURM0
> CryptoType = crypto/munge
> DebugFlags = (null)
> DefMemPerNode = UNLIMITED
> DisableRootJobs = NO
> EnforcePartLimits = NO
> Epilog = (null)
> EpilogMsgTime = 2000 usec
> EpilogSlurmctld = (null)
> FastSchedule = 1
> FirstJobId = 1
> GetEnvTimeout = 2 sec
> GresTypes = (null)
> GroupUpdateForce = 0
> GroupUpdateTime = 600 sec
> HASH_VAL = Match
> HealthCheckInterval = 0 sec
> HealthCheckProgram = (null)
> InactiveLimit = 0 sec
> JobAcctGatherFrequency = 30 sec
> JobAcctGatherType = jobacct_gather/none
> JobCheckpointDir = /var/slurm/checkpoint
> JobCompHost = localhost
> JobCompLoc = /var/log/slurm_jobcomp.log
> JobCompPort = 0
> JobCompType = jobcomp/none
> JobCompUser = root
> JobCredentialPrivateKey = (null)
> JobCredentialPublicCertificate = (null)
> JobFileAppend = 0
> JobRequeue = 1
> JobSubmitPlugins = (null)
> KillOnBadExit = 0
> KillWait = 30 sec
> Licenses = (null)
> MailProg = /usr/bin/mail
> MaxJobCount = 10000
> MaxJobId = 4294901760
> MaxMemPerNode = UNLIMITED
> MaxStepCount = 40000
> MaxTasksPerNode = 128
> MessageTimeout = 10 sec
> MinJobAge = 300 sec
> MpiDefault = none
> MpiParams = (null)
> NEXT_JOB_ID = 79
> OverTimeLimit = 0 min
> PluginDir = /usr/lib/slurm
> PlugStackConfig = /etc/slurm-llnl/plugstack.conf
> PreemptMode = OFF
> PreemptType = preempt/none
> PriorityType = priority/basic
> PrivateData = none
> ProctrackType = proctrack/pgid
> Prolog = (null)
> PrologSlurmctld = (null)
> PropagatePrioProcess = 0
> PropagateResourceLimits = ALL
> PropagateResourceLimitsExcept = (null)
> ResumeProgram = (null)
> ResumeRate = 300 nodes/min
> ResumeTimeout = 60 sec
> ResvOverRun = 0 min
> ReturnToService = 0
> SallocDefaultCommand = (null)
> SchedulerParameters = (null)
> SchedulerPort = 7321
> SchedulerRootFilter = 1
> SchedulerTimeSlice = 30 sec
> SchedulerType = sched/backfill
> SelectType = select/linear
> SlurmUser = root(0)
> SlurmctldDebug = 6
> SlurmctldLogFile = /var/log/slurm-llnl/slurmctld.log
> SlurmSchedLogFile = (null)
> SlurmctldPort = 6817
> SlurmctldTimeout = 300 sec
> SlurmdDebug = 5
> SlurmdLogFile = /var/log/slurm-llnl/slurmd.log
> SlurmdPidFile = /var/run/slurmd.pid
> SlurmdPort = 6818
> SlurmdSpoolDir = /tmp/slurmd
> SlurmdTimeout = 300 sec
> SlurmdUser = root(0)
> SlurmSchedLogLevel = 0
> SlurmctldPidFile = /var/run/slurmctld.pid
> SLURM_CONF = /etc/slurm-llnl/slurm.conf
> SLURM_VERSION = 2.3.2
> SrunEpilog = (null)
> SrunProlog = (null)
> StateSaveLocation = /tmp
> SuspendExcNodes = (null)
> SuspendExcParts = (null)
> SuspendProgram = (null)
> SuspendRate = 60 nodes/min
> SuspendTime = NONE
> SuspendTimeout = 30 sec
> SwitchType = switch/none
> TaskEpilog = (null)
> TaskPlugin = task/none
> TaskPluginParam = (null type)
> TaskProlog = (null)
> TmpFS = /tmp
> TopologyPlugin = topology/none
> TrackWCKey = 0
> TreeWidth = 50
> UsePam = 0
> UnkillableStepProgram = (null)
> UnkillableStepTimeout = 60 sec
> VSizeFactor = 0 percent
> WaitTime = 0 sec
> Slurmctld(primary/backup) at SLURM0/(NULL) are UP/DOWN
>
> miao@SLURM0:~$ sinfo
> PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
> debug*    up   infinite       2  idle SLURM[1-2]
>
> miao@SLURM0:~$ cat /var/log/slurm-llnl/slurmctld.log
> [2014-01-04T16:59:15] debug2: Processing RPC: REQUEST_SUBMIT_BATCH_JOB from uid=1000
> [2014-01-04T16:59:15] debug2: found 2 usable nodes from config containing SLURM[1-2]
> [2014-01-04T16:59:15] debug2: sched: JobId=74 allocated resources: NodeList=(null)
> [2014-01-04T16:59:15] _slurm_rpc_submit_batch_job JobId=74 usec=456
> [2014-01-04T16:59:15] debug: sched: Running job scheduler
> [2014-01-04T16:59:15] debug2: found 2 usable nodes from config containing SLURM[1-2]
> [2014-01-04T16:59:15] sched: Allocate JobId=74 NodeList=SLURM1 #CPUs=1
> [2014-01-04T16:59:15] debug2: Spawning RPC agent for msg_type 4005
> [2014-01-04T16:59:15] debug2: got 1 threads to send out
> [2014-01-04T16:59:15] debug2: Tree head got back 0 looking for 1
> [2014-01-04T16:59:15] debug2: Tree head got back 1
> [2014-01-04T16:59:15] debug2: Tree head got them all
> [2014-01-04T16:59:15] debug2: Processing RPC: REQUEST_JOB_ALLOCATION_INFO_LITE from uid=1000
> [2014-01-04T16:59:15] debug: _slurm_rpc_job_alloc_info_lite JobId=74 NodeList=SLURM1 usec=73
> [2014-01-04T16:59:15] debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=1000
> [2014-01-04T16:59:15] debug: Configuration for job 74 complete
> [2014-01-04T16:59:15] debug: laying out the 1 tasks on 1 hosts SLURM1 dist 1
> [2014-01-04T16:59:15] sched: _slurm_rpc_job_step_create: StepId=74.0 SLURM1 usec=490
> [2014-01-04T16:59:15] debug: Processing RPC: REQUEST_STEP_COMPLETE for 74.0 nodes 0-0 rc=0 uid=0
> [2014-01-04T16:59:15] sched: _slurm_rpc_step_complete StepId=74.0 usec=106
> [2014-01-04T16:59:15] debug2: Processing RPC: REQUEST_JOB_ALLOCATION_INFO_LITE from uid=1000
> [2014-01-04T16:59:15] debug: _slurm_rpc_job_alloc_info_lite JobId=74 NodeList=SLURM1 usec=307
> [2014-01-04T16:59:15] debug2: Processing RPC: REQUEST_JOB_STEP_CREATE from uid=1000
> [2014-01-04T16:59:15] debug: laying out the 1 tasks on 1 hosts SLURM1 dist 1
> [2014-01-04T16:59:15] sched: _slurm_rpc_job_step_create: StepId=74.1 SLURM1 usec=592
> [2014-01-04T16:59:15] debug: Processing RPC: REQUEST_STEP_COMPLETE for 74.1 nodes 0-0 rc=0 uid=0
> [2014-01-04T16:59:15] sched: _slurm_rpc_step_complete StepId=74.1 usec=104
> [2014-01-04T16:59:15] debug2: node_did_resp SLURM1
> [2014-01-04T16:59:15] debug2: Processing RPC: REQUEST_COMPLETE_BATCH_SCRIPT from uid=0 JobId=74
> [2014-01-04T16:59:15] completing job 74
> [2014-01-04T16:59:15] debug2: Spawning RPC agent for msg_type 6011
> [2014-01-04T16:59:15] sched: job_complete for JobId=74 successful
> [2014-01-04T16:59:15] debug2: _slurm_rpc_complete_batch_script JobId=74 usec=224
> [2014-01-04T16:59:15] debug2: got 1 threads to send out
> [2014-01-04T16:59:15] debug2: Tree head got back 0 looking for 1
> [2014-01-04T16:59:15] debug2: Tree head got back 1
> [2014-01-04T16:59:15] debug2: Tree head got them all
> [2014-01-04T16:59:16] debug2: node_did_resp SLURM1
>
> miao@SLURM0:~$ srun -n2 -l hostname
> 1: SLURM2
> 0: SLURM1
> miao@SLURM0:~$
>
> Thank you very much!
> Best regards
>
> Ian Malcolm
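P.S. If you'd rather choose the file names yourself, sbatch also accepts --output and --error, either on the command line or as #SBATCH directives. A sketch of your script with those added (the file names here are only examples; %j expands to the job id):

    #!/bin/sh
    #SBATCH --time=30
    #SBATCH --output=myjob-%j.out    # example name for stdout
    #SBATCH --error=myjob-%j.err     # example name for stderr
    /bin/hostname
    srun -l /bin/hostname
    srun -l /bin/pwd

A batch job never streams output to your terminal; running srun directly (as in your last example) is what prints to the terminal, which is why the sbatch run looked silent.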