Hi all,

When I run SLURM, slurmd reports the following error:

"slurmd: error: Error reading return code message from slurmstepd: rc=0: No error"

I don't know what is causing it, and it has been bothering me for a long time. Does anyone have a clue about what's going on? Below is the slurmd log.

[2013-04-16T17:03:48-06:00] Node configuration differs from hardware: CPUs=1:4(hw) Boards=1:1(hw) Sockets=1:2(hw) CoresPerSocket=1:1(hw) ThreadsPerCore=1:2(hw)
[2013-04-16T17:03:48-06:00] debug3: Trying to load plugin /usr/local/lib/slurm/topology_none.so
[2013-04-16T17:03:48-06:00] topology NONE plugin loaded
[2013-04-16T17:03:48-06:00] debug3: Success.
[2013-04-16T17:03:48-06:00] Gathering cpu frequency information for 4 cpus
[2013-04-16T17:03:48-06:00] debug: cpu_freq_init: cpu 0, reset freq: 2800000, reset governor: ondemand
[2013-04-16T17:03:48-06:00] debug: cpu_freq_init: cpu 1, reset freq: 2800000, reset governor: ondemand
[2013-04-16T17:03:48-06:00] debug: cpu_freq_init: cpu 2, reset freq: 2800000, reset governor: ondemand
[2013-04-16T17:03:48-06:00] debug: cpu_freq_init: cpu 3, reset freq: 2800000, reset governor: ondemand
[2013-04-16T17:03:48-06:00] debug3: NodeName = localhost
[2013-04-16T17:03:48-06:00] debug3: TopoAddr = localhost
[2013-04-16T17:03:48-06:00] debug3: TopoPattern = node
[2013-04-16T17:03:48-06:00] debug3: CacheGroups = 0
[2013-04-16T17:03:48-06:00] debug3: Confile = `/usr/local/etc/slurm.conf'
[2013-04-16T17:03:48-06:00] debug3: Debug = 3
[2013-04-16T17:03:48-06:00] debug3: CPUs = 1 (CF: 1, HW: 4)
[2013-04-16T17:03:48-06:00] debug3: Boards = 1 (CF: 1, HW: 1)
[2013-04-16T17:03:48-06:00] debug3: Sockets = 1 (CF: 1, HW: 2)
[2013-04-16T17:03:48-06:00] debug3: Cores = 1 (CF: 1, HW: 1)
[2013-04-16T17:03:48-06:00] debug3: Threads = 1 (CF: 1, HW: 2)
[2013-04-16T17:03:48-06:00] debug3: UpTime = 97626 = 1-03:07:06
[2013-04-16T17:03:48-06:00] debug3: Block Map = 0,2,1,3
[2013-04-16T17:03:48-06:00] debug3: Inverse Map = 0,2,1,3
[2013-04-16T17:03:48-06:00] debug3: RealMemory = 7986
[2013-04-16T17:03:48-06:00] debug3: TmpDisk = 144232
[2013-04-16T17:03:48-06:00] debug3: Epilog = `(null)'
[2013-04-16T17:03:48-06:00] debug3: Logfile = `/tmp/slurmd.log'
[2013-04-16T17:03:48-06:00] debug3: HealthCheck = `(null)'
[2013-04-16T17:03:48-06:00] debug3: NodeName = localhost
[2013-04-16T17:03:48-06:00] debug3: NodeAddr = (null)
[2013-04-16T17:03:48-06:00] debug3: Port = 6818
[2013-04-16T17:03:48-06:00] debug3: Prolog = `(null)'
[2013-04-16T17:03:48-06:00] debug3: TmpFS = `/tmp'
[2013-04-16T17:03:48-06:00] debug3: Public Cert = `whatever'
[2013-04-16T17:03:48-06:00] debug3: Slurmstepd = `/usr/local/sbin/slurmstepd'
[2013-04-16T17:03:48-06:00] debug3: Spool Dir = `/tmp/slurmd'
[2013-04-16T17:03:48-06:00] debug3: Pid File = `/tmp/slurmd.pid'
[2013-04-16T17:03:48-06:00] debug3: Slurm UID = 1002
[2013-04-16T17:03:48-06:00] debug3: TaskProlog = `(null)'
[2013-04-16T17:03:48-06:00] debug3: TaskEpilog = `(null)'
[2013-04-16T17:03:48-06:00] debug3: TaskPluginParam = 0
[2013-04-16T17:03:48-06:00] debug3: Use PAM = 0
[2013-04-16T17:03:48-06:00] debug3: Trying to load plugin /usr/local/lib/slurm/jobacct_gather_none.so
[2013-04-16T17:03:48-06:00] Job accounting gather NOT_INVOKED plugin loaded
[2013-04-16T17:03:48-06:00] debug3: Success.
[2013-04-16T17:03:48-06:00] debug3: Trying to load plugin /usr/local/lib/slurm/proctrack_pgid.so
[2013-04-16T17:03:48-06:00] debug3: Success.
[2013-04-16T17:03:48-06:00] debug3: Trying to load plugin /usr/local/lib/slurm/task_none.so
[2013-04-16T17:03:48-06:00] task NONE plugin loaded
[2013-04-16T17:03:48-06:00] debug3: Success.
[2013-04-16T17:03:48-06:00] debug3: Trying to load plugin /usr/local/lib/slurm/auth_none.so
[2013-04-16T17:03:48-06:00] Null authentication plugin loaded
[2013-04-16T17:03:48-06:00] debug3: Success.
[2013-04-16T17:03:48-06:00] debug: spank: opening plugin stack /usr/local/etc/plugstack.conf
[2013-04-16T17:03:48-06:00] debug3: Trying to load plugin /usr/local/lib/slurm/crypto_munge.so
[2013-04-16T17:03:48-06:00] Munge cryptographic signature plugin loaded
[2013-04-16T17:03:48-06:00] debug3: Success.
[2013-04-16T17:03:48-06:00] debug3: initializing slurmd spool directory
[2013-04-16T17:03:48-06:00] debug3: slurmd initialization successful
[2013-04-16T17:03:48-06:00] slurmd version 2.5.3 started
[2013-04-16T17:03:48-06:00] debug3: finished daemonize
[2013-04-16T17:03:48-06:00] debug3: Trying to load plugin /usr/local/lib/slurm/switch_none.so
[2013-04-16T17:03:48-06:00] switch NONE plugin loaded
[2013-04-16T17:03:48-06:00] debug3: Success.
[2013-04-16T17:03:48-06:00] debug3: successfully opened slurm listen port *:6818
[2013-04-16T17:03:48-06:00] slurmd started on Tue, 16 Apr 2013 17:03:48 -0600
[2013-04-16T17:03:48-06:00] Procs=1 Boards=1 Sockets=1 Cores=1 Threads=1 Memory=7986 TmpDisk=144232 Uptime=97626
[2013-04-16T17:03:48-06:00] debug3: Trying to load plugin /usr/local/lib/slurm/acct_gather_energy_none.so
[2013-04-16T17:03:48-06:00] AcctGatherEnergy NONE plugin loaded
[2013-04-16T17:03:48-06:00] debug3: Success.
[2013-04-16T17:03:49-06:00] debug3: in the service_connection
[2013-04-16T17:03:49-06:00] debug2: got this type of message 6001
[2013-04-16T17:03:49-06:00] debug2: Processing RPC: REQUEST_LAUNCH_TASKS
[2013-04-16T17:03:49-06:00] debug: task_slurmd_launch_request: 4294938102.1239378013 0
[2013-04-16T17:03:49-06:00] launch task 4294938102.1239378013 request from [email protected] (port 16014)
[2013-04-16T17:03:49-06:00] debug: Calling /usr/local/sbin/slurmstepd spank prolog
[2013-04-16T17:03:49-06:00] Reading slurm.conf file: /usr/local/etc/slurm.conf
[2013-04-16T17:03:49-06:00] Running spank/prolog for jobid [4294938102] uid [1000]
[2013-04-16T17:03:49-06:00] spank: opening plugin stack /usr/local/etc/plugstack.conf
[2013-04-16T17:03:49-06:00] debug3: _rpc_launch_tasks: call to _forkexec_slurmstepd
[2013-04-16T17:03:49-06:00] debug3: slurmstepd rank 0 (localhost), parent rank -1 (NONE), children 0, depth 0, max_depth 0
[2013-04-16T17:03:49-06:00] debug3: _send_slurmstepd_init: call to getpwuid_r
[2013-04-16T17:03:49-06:00] debug3: _send_slurmstepd_init: return from getpwuid_r
[2013-04-16T17:03:49-06:00] debug level is 8.
[2013-04-16T17:03:49-06:00] Trying to load plugin /usr/local/lib/slurm/jobacct_gather_none.so
[2013-04-16T17:03:49-06:00] Job accounting gather NOT_INVOKED plugin loaded
[2013-04-16T17:03:49-06:00] Success.
[2013-04-16T17:03:49-06:00] jobacct_gather dynamic logging enabled
[2013-04-16T17:03:49-06:00] Trying to load plugin /usr/local/lib/slurm/switch_none.so
[2013-04-16T17:03:49-06:00] switch NONE plugin loaded
[2013-04-16T17:03:49-06:00] Success.
[2013-04-16T17:03:49-06:00] slurmstepd rank 0, parent address = 0.0.0.0, port = 0
[2013-04-16T17:03:49-06:00] Received cpu frequency information for 4 cpus
[2013-04-16T17:03:49-06:00] setup for a launch_task
[2013-04-16T17:03:49-06:00] entering job_create
[2013-04-16T17:03:49-06:00] error: Error reading return code message from slurmstepd: rc=0: No error
[2013-04-16T17:03:49-06:00] debug3: _rpc_launch_tasks: return from _forkexec_slurmstepd
[2013-04-16T17:05:15-06:00] got shutdown request
[2013-04-16T17:05:15-06:00] all threads complete
[2013-04-16T17:05:15-06:00] Munge cryptographic signature plugin unloaded
[2013-04-16T17:05:15-06:00] Slurmd shutdown completing

Any tip would be helpful. Thank you!

Ke

--
Ke Wang, Ph.D. Candidate
Department of Computer Science, Illinois Institute of Technology (IIT)
Data-Intensive Distributed Systems Laboratory, CS/IIT
Cel: 1-312-532-2105
Addr: 10 W. 31st Street, Stuart Building, Room 003B, Chicago, IL 60616
Email: [email protected] [email protected]
Web: http://datasys.cs.iit.edu/~kewang/
     http://datasys.cs.iit.edu/
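
For anyone who wants to scan a log like the one above, here is a minimal Python sketch that pulls out the two entries that look most relevant: the "Node configuration differs from hardware" line and any "error:" entries. The function names (config_mismatches, error_lines) are hypothetical helpers written for this illustration, not part of SLURM, and the regular expressions simply assume the log format shown in this post. It does not claim that the configuration mismatch is the cause of the slurmstepd error; it only makes both entries easy to spot.

```python
import re

# Matches fields like "CPUs=1:4(hw)": name, configured value, hardware value.
# The pattern is an assumption based on the log format in this post.
MISMATCH_RE = re.compile(r"(\w+)=(\d+):(\d+)\(hw\)")


def config_mismatches(line):
    """Return {field: (configured, hardware)} for fields whose values differ."""
    return {
        name: (int(conf), int(hw))
        for name, conf, hw in MISMATCH_RE.findall(line)
        if conf != hw
    }


def error_lines(log_text):
    """Return the log entries whose message begins with 'error:'."""
    return [ln for ln in log_text.splitlines() if re.search(r"\]\s+error:", ln)]


# Two entries copied from the slurmd log in this post.
sample = (
    "[2013-04-16T17:03:48-06:00] Node configuration differs from hardware: "
    "CPUs=1:4(hw) Boards=1:1(hw) Sockets=1:2(hw) CoresPerSocket=1:1(hw) "
    "ThreadsPerCore=1:2(hw)\n"
    "[2013-04-16T17:03:49-06:00] error: Error reading return code message "
    "from slurmstepd: rc=0: No error\n"
)

if __name__ == "__main__":
    print(config_mismatches(sample.splitlines()[0]))
    for entry in error_lines(sample):
        print(entry)
```

To run it against a real log, read the file named in the Logfile entry (here /tmp/slurmd.log) and pass its contents to error_lines.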
