Al Chu <[email protected]> writes:

> Let's try some tests. Could you run bmc-watchdog "by hand" to make sure
> things look like it's working right? "by hand", I mean something like
> run:
>
> bmc-watchdog --get (see what the current watchdog settings are)
> bmc-watchdog --set ... (with same as deamon options, except not the
> reset interval '-e 60')
> bmc-watchdog --get (see that things are set)
> bmc-watchdog --start
> bmc-watchdog --get (make sure things changed, timer is running)
> bmc-watchdog --get (make sure timer is counting down)
> bmc-watchdog --reset
> bmc-watchdog --get (make sure timer has reset)
>
> (and you probably want to do bmc-watchdog --stop at the end)
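The by-hand sequence quoted above could also be scripted so the same steps are repeatable on several machines. This is only a minimal sketch: the function names and the `run` hook are hypothetical, the `--set` values are the ones used later in this thread, and it assumes `bmc-watchdog` is on the PATH and run as root on a box where a watchdog reset is acceptable.

```python
import subprocess

# --set values taken from the invocation used in this thread.
SET_ARGS = ["-u", "4", "-p", "0", "-a", "1", "-i", "900"]


def manual_check_steps(set_args=SET_ARGS):
    """Return the bmc-watchdog argument lists for the by-hand check, in order."""
    return [
        ["--get"],              # current watchdog settings
        ["--set", *set_args],   # daemon options, minus the reset interval
        ["--get"],              # settings should now be applied
        ["--start"],
        ["--get"],              # timer should be running
        ["--get"],              # timer should be counting down
        ["--reset"],
        ["--get"],              # timer should have reset
        ["--stop"],             # leave the timer stopped at the end
    ]


def run_manual_check(run=None, binary="bmc-watchdog"):
    """Run each step and return a list of (argv, output) pairs.

    `run` is a hypothetical hook taking an argv list and returning text;
    by default the real command is executed via subprocess.
    """
    if run is None:
        def run(argv):
            result = subprocess.run(argv, capture_output=True, text=True)
            return result.stdout + result.stderr
    results = []
    for args in manual_check_steps():
        argv = [binary, *args]
        results.append((argv, run(argv)))
    return results
```

Passing a stub as `run` lets the sequence be checked without touching the BMC; a short pause between the two consecutive `--get` steps (the transcripts in this thread use `sleep 2`) makes the countdown change easier to see.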

I should have said I was puzzled by when it says Stopped. This is a
RH5, Sun ILOM 2 system (not ELOM as I thinko'd before).

# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Set
Timer Use BIOS POST Flag: Set
Timer Use BIOS OS Load Flag: Set
Timer Use BIOS SMS/OS Flag: Set
Timer Use BIOS OEM Flag: Set
Initial Countdown: 900 seconds
Current Countdown: 900 seconds

# bmc-watchdog --set -u 4 -p 0 -a 1 -i 900
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Set
Timer Use BIOS POST Flag: Set
Timer Use BIOS OS Load Flag: Set
Timer Use BIOS SMS/OS Flag: Set
Timer Use BIOS OEM Flag: Set
Initial Countdown: 900 seconds
Current Countdown: 900 seconds

# bmc-watchdog --start
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 900 seconds

# sleep 2
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 898 seconds

# bmc-watchdog --reset
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 900 seconds

> This can help us isolate things. If the above works, then maybe there
> is a timing issue within your BMC that we need to get around. I'm a
> little perplexed as to why it would work with the openipmi driver. It's
> possible it's more generous on some timeouts of packets and such. Or
> maybe the openipmi driver's own watchdog implementation/code has done
> something to massage the BMC that I'm unaware of.

I probably wasn't clear. What I meant was:

# bmc-watchdog -g --config-file /dev/null
ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
ipmi-kcs-driver.c: 858: ipmi_kcs_read: error 'BMC busy' (7)
ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
ipmi-kcs-driver.c: 749: ipmi_kcs_write: error 'BMC busy' (7)
ipmi-kcs-driver.c: 858: ipmi_kcs_read: error 'BMC busy' (7)
bmc-watchdog: Get Watchdog Timer Error: BMC Busy

in contrast to:

# bmc-watchdog -g --config-file /dev/null -D OPENIPMI|head -1
Timer Use: SMS/OS
...

and

# bmc-info --config-file /dev/null
Device ID : 32
...

Actually now it's obvious there's something wrong with the ILOM,
thanks. I've now tried on an x2200M2 with ELOM with the results below
(and I don't have to specify the openipmi driver). I guess I won't get
anywhere with a service request on this -- especially as I'm only doing
it because Sun couldn't fix the hangups on the Thumper -- but perhaps
you have a simple idea for a fix?

# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 0 seconds

# bmc-watchdog --set -u 4 -p 0 -a 1 -i 900
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Stopped
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 0 seconds

# bmc-watchdog --start
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Running
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 900 seconds

# sleep 2
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Running
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 898 seconds

# bmc-watchdog --reset
# bmc-watchdog --get
Timer Use: SMS/OS
Timer: Running
Logging: Enabled
Timeout Action: Hard Reset
Pre-Timeout Interrupt: None
Pre-Timeout Interval: 0 seconds
Timer Use BIOS FRB2 Flag: Clear
Timer Use BIOS POST Flag: Clear
Timer Use BIOS OS Load Flag: Clear
Timer Use BIOS SMS/OS Flag: Clear
Timer Use BIOS OEM Flag: Clear
Initial Countdown: 900 seconds
Current Countdown: 899 seconds

_______________________________________________
Freeipmi-users mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/freeipmi-users
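
The oddity in the ILOM transcript (Timer reported as Stopped while Current Countdown drops from 900 to 898) can be picked out mechanically by comparing two --get outputs. The following is a minimal sketch, assuming the output keeps the "Field: value" layout shown in this thread; the function names are hypothetical and the sample strings are abbreviated from the transcripts above.

```python
def parse_watchdog_get(text):
    """Parse 'Field: value' lines from `bmc-watchdog --get` output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def countdown_seconds(fields):
    """Return the 'Current Countdown' field as an integer number of seconds."""
    return int(fields["Current Countdown"].split()[0])


def stopped_but_counting(earlier, later):
    """True if both reads say 'Timer: Stopped' yet the countdown decreased."""
    a = parse_watchdog_get(earlier)
    b = parse_watchdog_get(later)
    both_stopped = a["Timer"] == "Stopped" and b["Timer"] == "Stopped"
    return both_stopped and countdown_seconds(b) < countdown_seconds(a)


# Abbreviated samples from the thread: after --start, then after `sleep 2`.
ILOM_AFTER_START = "Timer: Stopped\nCurrent Countdown: 900 seconds"
ILOM_AFTER_SLEEP = "Timer: Stopped\nCurrent Countdown: 898 seconds"
ELOM_AFTER_START = "Timer: Running\nCurrent Countdown: 900 seconds"
ELOM_AFTER_SLEEP = "Timer: Running\nCurrent Countdown: 898 seconds"

if __name__ == "__main__":
    print("ILOM inconsistent:", stopped_but_counting(ILOM_AFTER_START, ILOM_AFTER_SLEEP))
    print("ELOM inconsistent:", stopped_but_counting(ELOM_AFTER_START, ELOM_AFTER_SLEEP))
```

The check only flags a mismatch between the reported state and the countdown; it says nothing about which of the two fields the BMC is getting wrong.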
