Hi Kang-sen,

You had either a network or a disk problem. The same problem happened several times before IMMND was killed by NID.

First start:
Jan 4 12:28:48 BHA-IND-MUM-MALAD-CAE-8 opensafd: Starting OpenSAF Services
....
Jan 4 12:28:54 BHA-IND-MUM-MALAD-CAE-8 osafimmnd[53251]: Started
....
Jan 4 12:33:24 BHA-IND-MUM-MALAD-CAE-8 opensafd: Stopping OpenSAF Services <----- stopping OpenSAF was not initiated by OpenSAF ??????

Second try:
Jan 4 12:34:06 BHA-IND-MUM-MALAD-CAE-8 opensafd: Starting OpenSAF Services
....
Jan 4 12:34:12 BHA-IND-MUM-MALAD-CAE-8 osafimmnd[57640]: Started
....
Jan 4 12:36:02 BHA-IND-MUM-MALAD-CAE-8 opensafd: Stopping OpenSAF Services
... and so on ...

In the last try, NID killed IMMND because it did not start within 8 minutes, which is the default IMMND timeout in NID (check /etc/opensaf/nodeinit.conf.payload).

Jan 4 12:44:20 BHA-IND-MUM-MALAD-CAE-8 osafimmnd[65304]: Started
....
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER Timed-out for response from IMMND <------- timeout after 8 minutes
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER Going for recovery
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER Trying To RESPAWN /usr/lib/opensaf/clc-cli/osaf-immnd attempt #1
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER Sending SIGKILL to IMMND, pid=65297
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 osafimmnd[65304]: exiting for shutdown
Jan 4 12:52:35 BHA-IND-MUM-MALAD-CAE-8 osafimmnd[1977]: Started

There is no syslog output after osafimmnd started, so I assume you had a network issue and IMMND was waiting to finish its initialization. A disk problem is another possibility, but it is unlikely since syslog entries were still being written to the disk.

Thanks,
Zoran

-----Original Message-----
From: Kang-Sen Lu [mailto:[email protected]]
Sent: 5 January 2017 14:23
To: Zoran Milinkovic <[email protected]>; [email protected]
Subject: RE: [users] question about payload blade osafimmnd startup problem

Hi Zoran,

Thanks for your reply. I am sending you the syslog from "Starting OpenSAF Services" up to the time we gave up. It contains the logs from all OpenSAF components. However, we didn't turn on trace for immnd, so there is no trace log to provide.

Unfortunately, this problem is not reproducible. After some time, the problem goes away.

Kang-sen

-----Original Message-----
From: Zoran Milinkovic [mailto:[email protected]]
Sent: Thursday, January 05, 2017 3:50 AM
To: Kang-Sen Lu <[email protected]>; [email protected]
Subject: RE: [users] question about payload blade osafimmnd startup problem

Hi Kang-sen,

The error indicates that IMMND did not start within the allowed time, so NID killed it. Please share the logs from before the error so we can see exactly what happened. If it is an IMMND problem, traces will help more in analyzing it.

Thanks,
Zoran

-----Original Message-----
From: Kang-Sen Lu [mailto:[email protected]]
Sent: 4 January 2017 18:17
To: [email protected]
Subject: [users] question about payload blade osafimmnd startup problem

We are running OpenSAF 4.4.0 on an HP chassis. One payload blade (slot 8) has an OpenSAF startup problem. Here is the relevant part of the syslog:

Jan 4 12:44:20 BHA-IND-MUM-MALAD-CAE-8 kernel: [660527.561238] tipc: Activated (version 2.0.1.2)
Jan 4 12:44:20 BHA-IND-MUM-MALAD-CAE-8 kernel: [660527.561327] NET: Registered protocol family 30
Jan 4 12:44:20 BHA-IND-MUM-MALAD-CAE-8 kernel: [660527.561444] tipc: Started in single node mode
Jan 4 12:44:20 BHA-IND-MUM-MALAD-CAE-8 kernel: [660527.563034] tipc: Started in network mode
Jan 4 12:44:20 BHA-IND-MUM-MALAD-CAE-8 kernel: [660527.563037] tipc: Own node address <1.1.129>, network identity 1234
Jan 4 12:44:20 BHA-IND-MUM-MALAD-CAE-8 kernel: [660527.565229] tipc: Enabled bearer <eth:bond0>, discovery domain <1.1.0>, priority 10
Jan 4 12:44:20 BHA-IND-MUM-MALAD-CAE-8 osafimmnd[65304]: Started
Jan 4 12:44:20 BHA-IND-MUM-MALAD-CAE-8 kernel: [660528.240310] IPMI Watchdog: response: Error d5 on cmd 22
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER Timed-out for response from IMMND
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER Going for recovery
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER Trying To RESPAWN /usr/lib/opensaf/clc-cli/osaf-immnd attempt #1
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER Sending SIGKILL to IMMND, pid=65297
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 osafimmnd[65304]: exiting for shutdown

Can anybody suggest how to find out what the problem is? The other payload blades did not have the same problem.

Thanks.
Kang-sen

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most engaging tech sites, SlashDot.org! http://sdm.link/slashdot
_______________________________________________
Opensaf-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-users
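
The 8-minute gap that Zoran points out between "osafimmnd ... Started" and "Timed-out for response from IMMND" can be checked mechanically when a syslog is long. The following is only a rough sketch in Python, not an OpenSAF tool: the function names (parse_line, immnd_start_to_timeout) are made up for illustration, the year is an assumption because these syslog timestamps carry none, and the sample lines are copied from the log quoted in this thread.

```python
import re
from datetime import datetime

# Sample lines copied from the syslog quoted in this thread.
SAMPLE = """\
Jan 4 12:44:20 BHA-IND-MUM-MALAD-CAE-8 osafimmnd[65304]: Started
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER Timed-out for response from IMMND
Jan 4 12:52:20 BHA-IND-MUM-MALAD-CAE-8 opensafd[65278]: ER Sending SIGKILL to IMMND, pid=65297
"""

MONTHS = {name: i + 1 for i, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"])}

# "Mon D HH:MM:SS host process[pid]: message"; the [pid] part is optional.
LINE_RE = re.compile(
    r"^(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<proc>[\w-]+)(?:\[\d+\])?:\s*(?P<msg>.*)$"
)


def parse_line(line, year=2017):
    """Return (timestamp, process, message), or None if the line does not match.

    The year is an assumption: the syslog format shown here omits it.
    """
    match = LINE_RE.match(line.strip())
    if match is None or match["mon"] not in MONTHS:
        return None
    stamp = datetime(year, MONTHS[match["mon"]], int(match["day"]),
                     int(match["h"]), int(match["m"]), int(match["s"]))
    return stamp, match["proc"], match["msg"]


def immnd_start_to_timeout(lines, year=2017):
    """Seconds from each osafimmnd start to the NID timeout that follows it."""
    waits = []
    started = None
    for line in lines:
        parsed = parse_line(line, year)
        if parsed is None:
            continue
        stamp, proc, msg = parsed
        if proc == "osafimmnd" and msg == "Started":
            started = stamp
        elif (proc == "opensafd" and started is not None
              and "Timed-out for response from IMMND" in msg):
            waits.append(int((stamp - started).total_seconds()))
            started = None
    return waits


if __name__ == "__main__":
    print(immnd_start_to_timeout(SAMPLE.splitlines()))  # [480], i.e. 8 minutes
```

Starts that end in "Stopping OpenSAF Services" rather than a timeout (as in the first and second tries) are not reported by this sketch; it only pairs a start with a NID timeout.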
