Bug#688521: SILO first boot after power-on or reset fails on Netra T1 200
All the drives I've tested are 9GB SCA. Of the discs available to me (WD, Seagate, Fujitsu, IBM), none that has been branded Compaq works adequately, while discs that have been branded Sun or are generic (e.g. straight from Seagate) are OK. I have never seen this problem on any other system, Sun or otherwise.

My suspicion is that the variant/version of OBP on these machines recognises that a disc has been badged by an OEM, assumes that it's Sun without checking in further detail, and tries to do something Sun-specific.

I think this issue should probably be closed, since I don't think it's a Linux or SILO problem. The most that can be done is to make sure a warning is available in an appropriate place in the platform-specific notes.

-- 
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]
Bug#688521: SILO first boot after power-on or reset fails on Netra T1 200
This is something to do with the type of disc being used, i.e. the exact firmware version or similar, rather than the Debian release or the version of SILO etc.

I'm using non-Sun SCA discs, badged Compaq but apparently WD. Some work OK, while others (nominally the same but with a visibly different PCB) fail the first boot.

This might still be specific to OBP in a Netra T1: both types of disc boot first time in an Ultra-1. It affects Lenny, Squeeze and Wheezy, and possibly other OSes.

-- 
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk

[Opinions above are the author's, not those of his employers or colleagues]
Bug#688521: SILO first boot after power-on or reset fails on Netra T1 200
> Looking at the output you see, I have doubts that it has anything to do with SILO though. SILO prints letters 'S', 'I', 'L' and 'O' (appearing before the prompt) after it completes execution of different parts of first-stage loader. As you can see in the code (first/first.S), printing 'S' is the first thing first-stage loader does upon startup. The fact that it is not seen in the console output suggests that even first-stage loader never got to run. The line
>
>     Boot device: /pci@1f,0/pci@1/scsi@8/disk@0,0:a File and args:
>
> which is normally printed by OBP before control is passed to SILO does not appear in the watchdog-reset case either, which, again, is a strong sign that failure happens before SILO has a chance to run.

OK, but it still boots Squeeze without complaint. And complains when booting Lenny.

> In a failure case, how long does it take between you typing 'boot' and watchdog reset message being displayed?

About a second.

> This doc http://docs.oracle.com/cd/E19102-01/n240.srvr/817-5481-11/understanding_wdtimer.html appears to suggest that stuck watchdog would initiate a XIR after 60 seconds by default, is it consistent with what you see? What are the values of various variables mentioned there on your system(s)? Does increasing the timeout help?

As far as I can see that's applicable to Solaris and ALOM. The T1 200 uses the lomlite2 chip.

> I really can't come up with any reason why it would work for Squeeze but not other releases, so testing all suspect SILO versions on the same machine would be an interesting experiment.

Working backwards using

    silo_1.4.14+git20120819-1_sparc.deb
    silo_1.4.14+git20100228-1+b1_sparc.deb
    silo_1.4.13a+git20070930-3_sparc.deb
    silo_1.4.13-1_sparc.deb

resulted in no change in symptoms. Trying to use silo_1.4.9-1_sparc.deb resulted in a system which dumped me straight into BusyBox. Putting the Squeeze disc back into the system at that point still worked without complaint.

In case I was doing anything obviously wrong, I was getting the .deb using wget and then installing using dpkg -i

I take Richard's point about it not being caused directly by the LOM chip (nothing in its log). The fact that Squeeze (still) works suggests that OBP and its variables including nvramrc aren't directly involved.

I take your point about SILO not being displayed. Observation (manual transcript follows):

#
Booting Squeeze:

    OpenBoot 4.0 [...]
    Ethernet address [...]
    ok boot
    Bad magic number in disk label
    Can't open disk label package
    Boot device: disk File and args:
    SILO version 1.4.14
    Boot:
#

i.e. that works without complaint. Booting Wheezy:

    OpenBoot 4.0 [...]
    Ethernet address [...]
    ok boot
    [Hex dump here]
    Watchdog Reset
    Externally Initiated Reset
    ok boot
    Boot device: /pci@1f,0/pci@1/scsi@8/disk@0,0:a File and args:
    SILO Version 1.4.[...]
    boot:

I need to go back and check Lenny (which fails to boot) again, that 'Can't open disk label package' message might be significant.
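The reasoning in this exchange (no "Boot device:" line and no SILO banner means the failure precedes SILO) can be applied mechanically to a captured console log. The following is only a minimal sketch under stated assumptions: the function name, the verdict strings and the idea of splitting the log at each "ok boot" line are illustrative choices, not part of SILO or OBP, and the sample text is the Wheezy transcript above with the hex dump elided as in the original.

```python
import re

# Assumption: each boot attempt starts with a line consisting of "ok boot".
BOOT_CMD = re.compile(r"^[ \t]*ok boot[ \t]*$", re.MULTILINE)


def classify_boot_attempts(transcript):
    """Report, for each 'ok boot' in a console log, how far the boot got."""
    attempts = BOOT_CMD.split(transcript)[1:]  # drop text before the first boot
    results = []
    for text in attempts:
        watchdog = "Watchdog Reset" in text
        obp_line = "Boot device:" in text  # printed by OBP before hand-over
        silo = re.search(r"SILO version", text, re.IGNORECASE) is not None
        if silo:
            verdict = "SILO started"
        elif obp_line:
            verdict = "OBP announced boot device, no SILO banner"
        elif watchdog:
            verdict = "reset before OBP announced boot device"
        else:
            verdict = "unknown"
        results.append({"watchdog": watchdog, "obp_line": obp_line,
                        "silo": silo, "verdict": verdict})
    return results


SAMPLE = """\
ok boot
[Hex dump here]
Watchdog Reset
Externally Initiated Reset
ok boot
Boot device: /pci@1f,0/pci@1/scsi@8/disk@0,0:a File and args:
SILO Version 1.4.14
boot:
"""

if __name__ == "__main__":
    for number, attempt in enumerate(classify_boot_attempts(SAMPLE), start=1):
        print(number, attempt["verdict"])
```

For this sample, the first attempt is classified as a reset before OBP announced the boot device and the second as a normal SILO start, which matches the reading of the transcript given in the thread.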
Bug#688521: SILO first boot after power-on or reset fails on Netra T1 200
Package: silo
Version: 1.4.14+git20120819-1
Severity: normal

On a Netra T1 200 with default installation (no desktop packages etc.) and with a serial terminal attached to the LOM port as console, the first boot command after power-on, reset or reboot fails with a watchdog timeout before SILO presents its boot prompt:

    lom>
    lom>version
    LOM version:v3.10
    LOM checksum: a068
    LOM firmware part# 258-7871-16
    Microcontroller:H8/3437S
    LOM firmware build Apr 3 2001 13:04:44
    lom>poweron
    lom>
    LOM event: +4h4m40s host power on

    Netra T1 200 (UltraSPARC-IIe 500MHz), No Keyboard
    OpenBoot 4.0, 1024 MB memory installed, Serial #51358633.
    Ethernet address 0:3:ba:f:ab:a9, Host ID: 830faba9.

    ok boot
    0015f000f5b8
    000d009730a100d61b8ffe9d00050010
    f0004200f000420400040010f0004200f000420400030010
    f0004200f0004204000200100070807080040001013e
    f000a860f000a864
    Watchdog Reset
    Externally Initiated Reset
    ok boot
    Boot device: /pci@1f,0/pci@1/scsi@8/disk@0,0:a File and args:
    SILO Version 1.4.14
    boot:
    Allocated 64 Megs of memory at 0x4000 for kernel
    Uncompressing image...

    [Successful boot here.]

Lomlite2 firmware is at 3.10, there are no watchdog entries in its log. LOM firmware can't be upgraded without using a non-free Solaris package.

This affects SILO on Wheezy and Lenny, but Squeeze boots despite having the same version as Wheezy:

    Lenny:   1.4.13
    Squeeze: 1.4.14
    Wheezy:  1.4.14

Kernel and configuration as shipped on CD. Behavior is predictable, and the same over several computers of the same model. I have not tried building any custom kernels, so can't say whether it depends crucially on e.g. the size or number of the kernels in silo.conf.

-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: sparc (sparc64)

Kernel: Linux 3.2.0-3-sparc64
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages silo depends on:
ii  libc6  2.13-35

silo recommends no packages.

silo suggests no packages.

-- Configuration Files:
/etc/silo.conf changed:
root=/dev/sda2
partition=1
default=Linux
read-only
timeout=100
image=/vmlinuz
    label=Linux
    initrd=/initrd.img
image=/vmlinuz.old
    label=LinuxOLD
    initrd=/initrd.img.old

-- no debconf information
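Since the report leaves open whether the number of kernels in silo.conf matters, it may help to list the image stanzas of a configuration before comparing machines. This is a rough sketch only: the helper name is hypothetical, it assumes that each `image=` line opens a stanza whose following `key=value` lines belong to it, and it assumes `#` starts a comment. It does not consult SILO itself, and global options before the first stanza are ignored.

```python
def image_stanzas(conf_text):
    """Return one dict per image= stanza found in silo.conf-style text."""
    stanzas = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: '#' begins a comment
        if not line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key == "image":
            stanzas.append({"image": value})  # start a new stanza
        elif stanzas:
            stanzas[-1][key] = value  # option belongs to the current stanza
        # options seen before the first image= line are global and skipped
    return stanzas


# The configuration quoted in the report above.
SILO_CONF = """\
root=/dev/sda2
partition=1
default=Linux
read-only
timeout=100
image=/vmlinuz
    label=Linux
    initrd=/initrd.img
image=/vmlinuz.old
    label=LinuxOLD
    initrd=/initrd.img.old
"""

if __name__ == "__main__":
    for stanza in image_stanzas(SILO_CONF):
        print(stanza["label"], stanza["image"], stanza.get("initrd"))
```

Run against the configuration from the report, this finds two stanzas, labelled Linux and LinuxOLD.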