Hi,

What is the state of your slurmctld?
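
If it helps to check this from a script, here is a rough Python sketch that pulls the controller state out of the summary line printed by "scontrol ping". The function name parse_scontrol_ping and the regular expression are my own assumptions, based only on the line format in the quoted output; other Slurm versions may print this line differently.

```python
import re

# Matches the summary line as it appears in the quoted output, e.g.
# "Slurmctld(primary/backup) at master/(NULL) are DOWN/DOWN".
# Assumption: this exact format; it is not guaranteed across Slurm versions.
PING_RE = re.compile(
    r"Slurmctld\(primary/backup\) at (?P<hosts>\S+) are (?P<states>\S+)"
)


def parse_scontrol_ping(output):
    """Return {role: (host, state)} parsed from `scontrol ping` output."""
    match = PING_RE.search(output)
    if match is None:
        raise ValueError("no Slurmctld status line found")
    hosts = match.group("hosts").split("/")
    states = match.group("states").split("/")
    # Pair each role with its (host, state) tuple.
    return dict(zip(("primary", "backup"), zip(hosts, states)))


# Sample line copied from the quoted message.
sample = "Slurmctld(primary/backup) at master/(NULL) are DOWN/DOWN"
status = parse_scontrol_ping(sample)
print(status)
```

A DOWN state for the primary means slurmctld is not answering on the controller host, so the next thing to look at is whether the daemon is running there and what its log says.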

On 25/08/2015 12:51, Fahad Ibrahim Alzannan wrote:
> Hi,
>
> We have a cluster and some nodes are down. We tried to set them idle
> using "scontrol update NodeName=xx State=idle", but they go back to
> down. We also tried to troubleshoot them using the SLURM Troubleshooting
> Guide, but unfortunately the nodes are still not working.
>
> Here are the command outputs.
>
> sinfo output:
>
>   sinfo: error: slurm_receive_msg: Zero Bytes were transmitted or received
>   slurm_load_partitions: Zero Bytes were transmitted or received.
>
> scontrol ping output:
>
>   scontrol: error: slurm_receive_msg: Zero Bytes were transmitted or received
>   Slurmctld(primary/backup) at master/(NULL) are DOWN/DOWN
>   **************************************************************
>   ** RESTORE SLURMCTLD DAEMON TO SERVICE **
>   **************************************************************
>
> Regards,
> Fahad Alzannan
> CTAM
> KACST

--
Mehdi Denou
International HPC support
+336 45 57 66 56
