Bug#509238: panic backtrace
Quoting: Christian Perrier bubu...@debian.org

> So, in short, in regular mode, it crashes (always at the same place)
> but in vga=771 mode, it doesn't, right?

Correct.

> And I assume that you get no crash as well if you're using the
> graphical installer.

I had not tried the graphical installer, figuring that the more basic
the better, but I just did. It appears that just before going to
graphical, the installer puts the screen in mode 771. I get the
smaller, sharper fonts. So, no. It doesn't crash in the graphical
installer. BTW, the graphical installer looks good!

> I'm puzzled to reassign this bug report. This is obviously a race
> condition somewhere. Probably a weird kernel issue but investigating
> further is tricky and I fear that this bug report might remain
> uninvestigated further for quite a while until it magically solves in
> the future with a new kernel (though probably not for Lenny).

That's probably the reasonable thing to do.

> I propose documenting this in the errata file, at the minimum.

Yes. A prominent (easy to find) note that would say something like
"If your laptop crashes completely during installation, try replacing
the boot option vga=normal with vga=771 or use the graphical installer"
would have saved some time, but then the problem would not have become
known.

Thank you very much for your help!

--
To UNSUBSCRIBE, email to debian-boot-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#509238: panic backtrace
Quoting: Christian Perrier bubu...@debian.org

> Quoting The Eclectic One (eclec...@sdf.lonestar.org):
> > First thought: race condition (the panic message contained a
> > backtrace of different threads), so then I tried multiple times
> > with only one change at a time: expert mode, regular mode,
> > ethdetect -x, vga=771, vga=normal. It turns out that the culprit
> > was the vga option. With vga=771, I get no crash/panic in either
> > expert or regular mode, with
>
> So, in short, in regular mode, it crashes (always at the same place)
> but in vga=771 mode, it doesn't, right?

Correct.
Bug#509238: panic backtrace
Quoting Christian Perrier bubu...@debian.org

> > Ok, tried a few more times. I usually get the same kernel panic
> > screen,
>
> Did you try in expert mode, ie choosing it from the Advanced options
> in the boot menu.

Yes, I tried expert as well as regular.

> In expert mode, when you reach the HW detection step, you'll get a
> question about PCMCIA options. They're not necessarily relevant but
> checking if the crash happens before or after it would help

No question about PCMCIA before getting to the network detection
screen/status bar.

> ...
> Some drivers in 2.6.18 provided firmware blobs that have been
> extracted from the source and are now provided as separate udebs:
>
> r...@mykerinos:~ apt-cache search firmware 2100
> firmware-ipw2x00 - Binary firmware for Intel Pro Wireless 2100, 2200 and 2915

That explains the missing firmware.

> If that firmware is needed for ipw2100, you'll be prompted about
> this but you're not, which means the crash happens before..:-)

Correct.

> > Anything else I can do? More tests?
>
> Before the network devices screen (for instance, when prompted for
> language), could you switch to VT2 (Alt+F2) and, from there edit
> /bin/ethdetect.sh
>
> Add -x to the first line:
>
> #!/bin/sh -x
>
> Then go back to VT1, continue to the step where the crash happens and
> switch to VT4 before it happens. As ethdetect.sh will be in debug
> mode, we'll see all its output and could narrow down the exact line
> where the crash happens.

Ok, did that and also vga=771. I was hoping to get a smaller font to
see more of the backtrace. This is what I got:

[ output of ethdetect -x - lsifaces (3 lines) sed (4), grep (4) and sed (3) ]
Dec 22 15:42:48 main-menu[1328]: (process:9628) + ip link set eth0 up
Dec 22 15:42:48 main-menu[1328]: (process:9628) + ip link set eth0 down
Dec 22 15:42:48 main-menu[1328]: (process:9628) + ip link set eth1 up
Dec 22 15:42:48 main-menu[1328]: (process:9628) + ip link set eth1 down
Dec 22 15:42:48 main-menu[1328]: (process:9628) + check-missing-firmware
Dec 22 15:42:48 main-menu[1328]: (process:9628) + sysfs-update-devnames
Dec 22 15:42:48 main-menu[1328]: (process:9628) + cleanup
Dec 22 15:42:48 main-menu[1328]: (process:9628) + rm -f /tmp/devnames-static.txt
Dec 22 15:42:48 main-menu[1328]: DEBUG: resolver (libslang2-udeb): package doesn't exist (ignored)

It didn't crash! After that I proceeded to console 1 and did see the
screen explaining that I needed the missing firmware - ipw2100-1.3.fw.
As I had thought, even with the missing firmware, the culprit wasn't
the wireless device.

First thought: race condition (the panic message contained a backtrace
of different threads), so then I tried multiple times with only one
change at a time: expert mode, regular mode, ethdetect -x, vga=771,
vga=normal. It turns out that the culprit was the vga option. With
vga=771, I get no crash/panic in either expert or regular mode, with
sh -x in ethdetect or not.

It might still be a race condition but if so it seems to be triggered
by something related to the display. Very strange. The display works
fine (with a big font) in normal mode. In 771 mode, the font is smaller
and the curses windows sharper.

Even though I now have a work-around, I'm willing to keep debugging if
it would be useful.
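The debugging step used in this message (adding -x to the first line of /bin/ethdetect.sh from VT2) was done by hand in an editor. As an illustration only, the same edit can be sketched as a small shell helper. The name enable_trace is hypothetical and not part of debian-installer; the sketch assumes the script's first line is exactly #!/bin/sh and that head, tail and cat are available, which has not been checked for the installer environment.

```shell
#!/bin/sh
# enable_trace SCRIPT
# Hypothetical helper: rewrite the shebang of SCRIPT from "#!/bin/sh"
# to "#!/bin/sh -x" so that every command is echoed as it runs.
enable_trace() {
    script=$1
    first=$(head -n 1 "$script")
    case $first in
        '#!/bin/sh -x')
            # Tracing is already enabled; nothing to do.
            return 0
            ;;
        '#!/bin/sh')
            tmp="$script.tmp.$$"
            # Build the new file, then copy it back over the original so
            # the original's permissions (executable bit) are preserved.
            { echo '#!/bin/sh -x'; tail -n +2 "$script"; } > "$tmp" &&
                cat "$tmp" > "$script" &&
                rm -f "$tmp"
            ;;
        *)
            echo "unexpected first line: $first" >&2
            return 1
            ;;
    esac
}
```

With tracing on, each command the script runs is logged with a leading "+", which is what produced the "+ ip link set eth0 up" style lines in the log above.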
Bug#509238: panic backtrace
Quoting Christian Perrier bubu...@debian.org

> OK. Are you in the position of testing with something else than a USB
> stick boot?

Actually, before I gave up on CDs (ruined 11 CD-Rs, probably marginal
media, drive or wodim problems) I had a CD made on a windows machine of
the lenny installer RC1. This is what I just tried again. I get the
same panic at the same point in the installation process. Of course the
addresses are slightly different as the underlying kernel code is
different.

> The best would be using the netboot ISO (called mini.iso) from a CD.
> ...
> > Ok, it happens immediately after the identifying network hardware
> > screen. So soon after in fact that I thought I would have no time to
>
> "after identifying network HW" means 'after the system displays a
> progress bar saying "Identifying network hardware"', right?

Correct.

> This is where we would need to really narrow things down and where
> using the expert mode could help.

Ok, tried a few more times. I usually get the same kernel panic screen,
but on one occasion (in expert mode), it crashed a little differently
and I saw this:

Dec 21 17:21:46 kernel: [ 173.096981] ipw2100: Intel(R) PRO/Wireless 2100 Network Driver, git-1.2.2
Dec 21 17:21:46 kernel: [ 173.096987] ipw2100:
Dec 21 17:21:46 kernel: [ 173.096987] ipw2100: Copyright(c) 2003-2006 Intel Corporation
Dec 21 17:21:46 kernel: [ 173.206202] ACPI: PCI Interrupt :02:03:0[A] - Link [LKNB] - GSI 5 (level, low) - IRQ 5
Dec 21 17:21:46 kernel: [ 173.206206] ipw2100: Detected Intel PRO/Wireless 2100 Network Connection
Dec 21 17:21:46 kernel: [ 173.206619] firmware: requesting ipw2100-1.3.fw
Dec 21 17:21:46 net/hw-detect.hotplug: Detected hotpluggable network interface eth0
Dec 21 17:21:46 kernel: [ 173.349107] ipw2100: eth1: Firmware 'ipw2100-1.3.fw' not available or load failed.
Dec 21 17:21:46 kernel: [ 173.349114] ipw2100: eth1: ipw2100_get_firmware failed: -2
Dec 21 17:21:46 kernel: [ 173.349118] ipw2100: eth1: Failed to power on the adapter.
Dec 21 17:21:46 kernel: [ 173.349121] ipw2100: eth1: Failed to start the firmware.
Dec 21 17:21:46 kernel: [ 173.349125] ipw2100Error calling register_netdev.
Dec 21 17:21:46 kernel: [ 173.349443] ACPI: PCI interrupt for device :02:03.0 disabled
Dec 21 17:21:46 kernel: [ 173.349450] ipw2100: probe of :02:03.0 failed with error -5
Dec 21 17:21:46 hw-detect: insmod /lib/modules/2.6.26-1-486/kernel/drivers/ieee1394/sbp2.ko
Dec 21 17:21:46 kernel: [ 173.480359] eth1394: eth1: IPv4 over IEEE 1394 (fw-host0)
Dec 21 17:21:46 net/hw-detect.hotplug: Detected hotpluggable network interface e

This is exactly as the screen froze.

I wonder if the only line visible that refers to eth0 is when it
actually detected the main (wired) ethernet interface. As this is the
interface I want to use and it appears that it generated no errors, it
seems to be a good sign.

All the ipw2100 lines refer to the wireless interface, which works fine
on 2.6.18. I never knew it required separate firmware. Since it appears
not to be included in the installer, I presume I'd have to find it in
the additional drivers media. In any case, the errors notwithstanding,
it looks like the installer handled the wireless interface correctly.

Am I right that it seems it's the eth1394 driver that is causing the
crash? I've tried (based on the help example:
hw-detect/start_pcmcia=false) to add hw-detect/start_eth1394=false but
it doesn't seem to have an effect. It still crashes. BTW, the eth1394
driver is loaded without errors under 2.6.18, although it is not used.
I have no fire-wire devices to test it. Early on, I passed the option
to protect the firewire interface addresses (faffd800 - faffdfff) but
it still crashed.

I suppose it would help if I could get more lines in the console
screens. What option can I pass the installer to have smaller type, or
even better 2 side by side pages? I have a WUXGA (1920 X 1200) screen
so it should be possible to see a much longer backtrace.

> ...
> OK. Thanks for the help trying to narrow things. Being an obvious
> problem with the kernel, we really need to triple check that it
> happens or not with the last kernel package from unstable (which is
> likely, but still...)

If you point me to a boot.img.gz of the latest kernel, that I can zcat
to the usb memory stick, I'd be happy to test it.

Anything else I can do? More tests?
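Writing a boot.img.gz to a USB memory stick with zcat, as mentioned in this message, might look like the sketch below. It is an illustration under stated assumptions, not the reporter's exact procedure: write_image is a hypothetical helper name, and /dev/sdX is a placeholder for the stick's whole-disk device node, which differs from machine to machine. The target is overwritten without confirmation.

```shell
#!/bin/sh
# write_image IMAGE TARGET
# Hypothetical helper: decompress a gzip-compressed boot image onto TARGET.
# TARGET is overwritten without asking, so double-check the device name.
write_image() {
    image=$1
    target=$2
    if [ ! -r "$image" ]; then
        echo "cannot read image: $image" >&2
        return 1
    fi
    # gzip -dc does the same job as zcat for .gz files.
    gzip -dc "$image" > "$target" && sync
}

# Example (placeholder device name, an assumption):
#   write_image boot.img.gz /dev/sdX
```

The sync at the end only asks the kernel to flush pending writes before the stick is removed; it does not verify the written data.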
Bug#509238: panic backtrace
> First of all, it would be nice if you could precise what exact
> version you tested.

I tried both the rc1 debian-testing-i386-netinst.iso and the daily
build as of a few days ago. Same result.

> The version we would like to see tested at this moment is:
>
> - RC1, which you can download from:
>   hd-media image:
>   http://ftp.nl.debian.org/debian/dists/testing/main/installer-i386/current/images/hd-media/
>   netinst ISO:
>   http://cdimage.debian.org/cdimage/lenny_di_rc1/i386/iso-cd/debian-testing-i386-netinst.iso
>
> - Daily builds:
>   hd-media:
>   http://people.debian.org/~joeyh/d-i/images/daily/hd-media/
>   netinst ISO:
>   http://cdimage.debian.org/cdimage/daily-builds/daily/arch-latest/i386/iso-cd/debian-testing-i386-netinst.iso

Ok, I downloaded these 4 again. It turns out that the daily build iso
was the exact same I had tried last. The boot.img.gz was different so I
re-made the usb stick disk with the one in the daily build link above
and the daily build iso. So the backtrace below applies to the daily
builds above.

> Then we could probably narrow the problem by using the expert mode,
> which will interrupt some steps more often and make the exact moment
> the problem happens more easy to spot.

Actually, I was able to capture the problem in non-expert mode. I
figure the least amount of changes made in the flow of execution of the
program, the easier it would be to duplicate the problem. I chose the
defaults in the language selection, locale, etc...

> Also, when you've spotted the moment where the installer hangs,
> please retry the installation and, just before the hang happens, try
> switching to console 4 (Alt+F4) and look at what's displayed there...

Ok, it happens immediately after the identifying network hardware
screen. So soon after in fact that I thought I would have no time to
switch to console 4, but quick fingers did it and quite a bit scrolled
off the screen, ending in this:

[ 23.523397] EAX: EBX: df6f ECX: EDX: df413440
[ 23.523397] ESI: df6f EDI: EBP: ESP: df449f84
[ 23.523397] DS: 007b ES: 007b FS: GS: SS: 0068
[ 23.523397] Process events/0 (pid: 5, ti=df448000 task=df43e860 task.ti=df448000)
[ 23.523397] Stack: c025d42d df6f 0002 c025830e df413440 c0356f00 c025833e
[ 23.523397]        c025835b c012716d df413440 c01276e9 df413448 c0127794 df43e860
[ 23.523397]        c012990b df449fc8 df449fc8 df413440 c0129763 c012972d
[ 23.523397] Call Trace:
[ 23.523397] [c025d42d] dev_deactivate+0x1e/0xbd
[ 23.523397] [c025830e] __linkwatch_run_queue+0x118/0x148
[ 23.523397] [c025833e] linkwatch_event+0x0/0x22
[ 23.523397] [c025835b] linkwatch_event+0x1d/0x22
[ 23.523397] [c012716d] run_workqueue+0x75/0xee
[ 23.523397] [c01276e9] worker_thread+0x0/0xb5
[ 23.523397] [c0127794] worker_thread+0xab/0xb5
[ 23.523397] [c012990b] autoremove_wake_function+0x0/0x2d
[ 23.523397] [c0129763] kthread+0x36/0x5b
[ 23.523397] [c012972d] kthread+0x0/0x5b
[ 23.523397] [c0104937] kernel_thread_helper+0x7/0x10
[ 23.523397] ===
[ 23.523397] Code: ff 42 50 5b c3 89 c1 8d 90 80 00 00 00 89 12 8d 81 a4 00 00 00 89 52 04 c7 42 08 00 00 00 00 83 c2 0c 39 c2 75 e7 31 c0 c3 89 c1 8b 40 10 8b 50 30 85 d2 74 04 89 c8 ff d2 c3 53 89 c3 e8 a8 af
[ 23.523397] EIP: [c025d074] qdisc_reset+0x2/0x11 SS:ESP 0068:df449f84
[ 23.915430] Kernel panic - not syncing: Fatal exception in interrupt

Of course, the machine is now totally frozen, so no scrolling back.

Any other tests? Let me know.
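When comparing panics from several boots, it can help to reduce a backtrace like the one in this message to its function names. The sketch below is one possible way to do that, written for the line format as it appears in this archive ("[address] symbol+0xoff/0xlen"); extract_symbols is a hypothetical helper name, and the pattern would need adjusting for kernels that print the address in a different form.

```shell
#!/bin/sh
# extract_symbols
# Hypothetical helper: read kernel log lines on stdin and print the symbol
# name from every entry of the form "[xxxxxxxx] symbol+0xOFF/0xLEN".
# Lines without such an entry (registers, "Call Trace:", etc.) are skipped.
extract_symbols() {
    sed -n 's|.*\[[0-9a-f]\{8\}\] \([A-Za-z_][A-Za-z0-9_.]*\)+0x[0-9a-f]*/0x[0-9a-f]*.*|\1|p'
}

# Example: the trace is assumed to have been typed or saved into panic.txt.
#   extract_symbols < panic.txt
```

Because the EIP line has the same "[address] symbol+offset" shape, the faulting function is printed along with the call-trace entries.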
Bug#509238: debian-installer: lenny installer (daily build) locks up after net hw detection screen
Package: debian-installer
Version: lenny installer
Severity: critical
Justification: breaks the whole system

-- System Information:
Debian Release: 4.0 --- Not really. It's the lenny installer
  APT prefers stable
  APT policy: (500, 'stable')
Architecture: i386 (i686)
Shell: /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18
Locale: LANG=en_US, LC_CTYPE=en_US (charmap=ISO-8859-1)

As the above system information shows, Etch works perfectly on this
system, although it was not easy to install on a blank HD when the OS
was woody.

The installation that triggered this bug was attempted by building a
bootable USB memory stick (boot.img.gz (downloaded today) +
debian-testing-i386-netinst.iso) to install lenny on a new blank HD.

Attempted work-arounds.

As can be seen from the output of lshw (done from Etch and appended
below), this laptop has lots of built-in hardware. In addition to the
suggested temporary options (noapic, nolapic, acpi=off, irqpoll) and
hw-detect/start_pcmcia=false, I have excluded all the i/o addresses
that are likely to cause a problem: the pcmcia system, the firewire
port, the wireless adapter (ipw2100), the AC97 sound system and the
modem. Obviously, since this is a net install, the network card (that
in Etch uses the b44 driver just fine) is mandatory.

Immediately after the network hardware detection is apparently
complete, the screen goes dark, the right 2 LEDs next to the power
button flash at 1-second intervals and the system totally freezes.
Holding the power button for about 10 seconds is necessary to shut down
the machine.

I'd be glad to run specific tests if asked. Any other suggestions of
work-arounds welcome.
Output of lshw follows:

dell-8600
    description: Portable Computer
    product: Inspiron 8600
    vendor: Dell Computer Corporation
    serial: xxx
    width: 32 bits
    capabilities: smbios-2.3 dmi-2.3
    configuration: boot=normal chassis=portable uuid=xxxC----
  *-core
       description: Motherboard
       product: 0X1069
       vendor: Dell Computer Corporation
       physical id: 0
       serial: .xxx.xx.
     *-firmware
          description: BIOS
          vendor: Dell Computer Corporation
          physical id: 0
          version: A11 (10/25/2004)
          size: 64KB
          capacity: 448KB
          capabilities: isa pci pcmcia pnp apm upgrade shadowing cdboot bootselect int13floppy720 int5printscreen int9keyboard int14serial int17printer int10video acpi usb agp smartbattery biosbootspecification netboot
     *-cpu
          description: CPU
          product: Intel(R) Pentium(R) M processor 1400MHz
          vendor: Intel Corp.
          physical id: 400
          bus info: c...@0
          version: 6.9.5
          slot: Microprocessor
          size: 1400MHz
          capacity: 1700MHz
          width: 32 bits
          clock: 133MHz
          capabilities: fpu fpu_exception wp vme de pse tsc msr mce cx8 sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 tm pbe est tm2
        *-cache:0
             description: L1 cache
             physical id: 700
             size: 8KB
             capacity: 8KB
             capabilities: internal write-back data
        *-cache:1
             description: L2 cache
             physical id: 701
             size: 1MB
             capacity: 1MB
             clock: 66MHz (15.0ns)
             capabilities: pipeline-burst internal varies unified
     *-memory
          description: System Memory
          physical id: 1000
          slot: System board or motherboard
          size: 512MB
          capacity: 1GB
        *-bank:0
             description: DIMM DDR Synchronous 333 MHz (3.0 ns)
             physical id: 0
             slot: DIMM_A
             size: 256MB
             width: 64 bits
             clock: 333MHz (3.0ns)
        *-bank:1
             description: DIMM DDR Synchronous 333 MHz (3.0 ns)
             physical id: 1
             slot: DIMM_B
             size: 256MB
             width: 64 bits
             clock: 333MHz (3.0ns)
     *-pci
          description: Host bridge
          product: 82855PM Processor to I/O Controller
          vendor: Intel Corporation
          physical id: e800
          bus info: p...@00:00.0
          version: 03
          width: 32 bits
          clock: 33MHz
          resources: iomemory:e800-efff
        *-pci:0
             description: PCI bridge
             product: 82855PM Processor to AGP Controller
             vendor: Intel Corporation
             physical id: 1
             bus info: p...@00:01.0
             version: 03
             width: 32 bits
             clock: 66MHz
             capabilities: pci normal_decode bus_master
           *-display
                description: VGA compatible controller
                product: NV28 [GeForce4 Ti 4200 Go AGP 8x]
                vendor: nVidia Corporation
                physical id: 0
                bus info: