Bug#701054: verbose kernel logs for thinkcentre m78: 3.2.0-4-686-pae, 3.7-trunk-686-pae, 3.8-trunk-686-pae
On Sun 2013-03-10 23:14:35 -0400, Daniel Kahn Gillmor wrote:
> On 03/10/2013 08:33 PM, Ben Hutchings wrote:
>> You can try applying the quirk by adding 'acpi_osi=Linux' to the kernel command line.
>
> thanks, i'll give that a try when i'm in front of the machine tomorrow.

this changed the log message to the following (from a 3.2 boot), but the behavior of the system remained the same, including the null pointer dereference in pulseaudio:

[0.588857] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline

>>> * just 3.2:
>>>   [1.676960] [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu
>>>   [1.690776] [Firmware Bug]: cpu 0, IBS interrupt offset 0 not available (MSRC001103A=0x0100)
>>>   [1.700255] Failed to setup IBS, -22
>>> * various hda-intel weirdnesses.
>>
>> Does it still crash when starting PulseAudio?
>
> kernel 3.2 (which has the above output) does have a null dereference within pulseaudio -- you can see it at 185.6 seconds into the first of the three boots. neither 3.7 nor 3.8 has this null dereference.
>
> using all three kernels, there is still a hang just before setting preliminary keymap, which is in /etc/rcS.d/S05keyboard-setup. So it's presumably hanging in one of:
>
>   S01mountkernfs.sh
>   S02udev
>   S03mountdevsubfs.sh
>   S04bootlogd
>
> i'm not sure what userspace process is causing that hang (that is, which one is being terminated when i send ctrl-c through the console), but i can try to track it down.

Ah. i discovered that if i just wait 3 minutes at that hang instead of pressing ctrl+C impatiently, the boot process does continue, showing the following error messages:

udevd[416]: worker [495] unexpectedly returned with status 0x0100 ^M
udevd[416]: worker [495] failed while handling '/devices/pci:00/:00:15.2/:03:00.3' ^M
done.
Setting preliminary keymap...done.

So it is almost certainly udev that is failing in this way.

Again, with the 3.2 kernel, pulseaudio crashes, but it is slightly different. it's no longer a null pointer dereference. (fwiw, uid 109 appears to be the Debian-gdm user)

Here is the backtrace from the crash after letting the boot process timeout with the acpi_osi=Linux parameter:

[ 201.019945] kernel tried to execute NX-protected page - exploit attempt? (uid: 109)
[ 201.020006] BUG: unable to handle kernel paging request at f62b7940
[ 201.020006] IP: [f62b7940] 0xf62b793f
[ 201.020006] *pdpt = 01484001 *pde = 8000362001e3
[ 201.020006] Oops: 0011 [#1] SMP
[ 201.020006] Modules linked in: sha1_generic hmac cbc cts bridge stp bnep rfcomm bluetooth rfkill crc16 rpcsec_gss_krb5 uinput fuse nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc loop snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm powernow_k8 snd_seq snd_timer snd_seq_device mperf crc32c_intel ipmi_si(+) i2c_piix4 i2c_core tpm_tis aesni_intel cryptd snd aes_i586 tpm ipmi_msghandler processor psmouse soundcore snd_page_alloc aes_generic serio_raw evdev tpm_bios pcspkr thermal_sys ext3 jbd mbcache dm_mod microcode usb_storage sg usbhid hid sd_mod sr_mod cdrom crc_t10dif ohci_hcd ehci_hcd xhci_hcd ahci libahci libata r8169 mii scsi_mod usbcore usb_common button [last unloaded: scsi_wait_scan]
[ 201.052254]
[ 201.052254] Pid: 3299, comm: pulseaudio Not tainted 3.2.0-4-686-pae #1 Debian 3.2.35-2 LENOVO 4865A14/Annapurna CRB
[ 201.052254] EIP: 0060:[f62b7940] EFLAGS: 00010202 CPU: 3
[ 201.052254] EIP is at 0xf62b7940
[ 201.052254] EAX: f37f6e2c EBX: 0080 ECX: f365e4c0 EDX: f6ee5400
[ 201.052254] ESI: f843b05c EDI: f692ef00 EBP: f6ee5a4c ESP: f6f71d98
[ 201.052254] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 201.052254] Process pulseaudio (pid: 3299, ti=f6f7 task=f2e0ea80 task.ti=f6f7)
[ 201.052254] Stack:
[ 201.052254] f8438515 0080 0080 f2ea5400 f365e4c0 f6deee00 f37f6e2c f692ef00
[ 201.052254] f6deee38 f6f71e04 f843c518 f6c31b00 f84e9bcb f365e4c0 f367ea00
[ 201.052254] f2e0ea80 f84e9cbf f6f71e04 f367eb18 f367eb2c f2e0ea80
[ 201.052254] Call Trace:
[ 201.052254] [f8438515] ? azx_pcm_open+0x171/0x1f7 [snd_hda_intel]
[ 201.052254] [f84e9bcb] ? snd_pcm_open_substream+0x36/0x68 [snd_pcm]
[ 201.052254] [f84e9cbf] ? snd_pcm_open+0xc2/0x1be [snd_pcm]
[ 201.052254] [c10320e5] ? try_to_wake_up+0x155/0x155
[ 201.052254] [f84e9e33] ? snd_pcm_playback_open+0x31/0x46 [snd_pcm]
[ 201.052254] [f83af49e] ? snd_open+0xf5/0x133 [snd]
[ 201.052254] [c10cf2d7] ? chrdev_open+0xf3/0x111
[ 201.052254] [c10cafb3] ? __dentry_open+0x17a/0x253
[ 201.052254] [c10cbd01] ? nameidata_to_filp+0x3a/0x45
[ 201.052254] [c10cf1e4] ? cdev_put+0x17/0x17
[ 201.052254] [c10d5dac] ? do_last+0x4f8/0x513
[ 201.052254] [c10d606e] ? path_openat+0xa1/0x28b
[ 201.052254] [c10d6301] ? do_filp_open+0x23/0x5c
[ 201.052254] [c102a0e5] ? should_resched+0x5/0x1e
[
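The call trace in the message above is easier to compare across boots once each frame is split into address, symbol, offset, and module. The following is a minimal Python sketch of that step, not something taken from the thread: the names `FRAME_RE` and `parse_frames` are hypothetical, and the pattern assumes each frame either sits on its own line or is followed by a timestamp such as `[ 201.052254]`, as in this log.

```python
import re

# Matches one call-trace frame, for example:
#   [f8438515] ? azx_pcm_open+0x171/0x1f7 [snd_hda_intel]
# The leading "?" marks an address found on the stack that the kernel
# could not confirm as a real return address, so it is optional here.
FRAME_RE = re.compile(
    r"\[(?P<addr>[0-9a-f]+)\]\s+(?:\?\s+)?"
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:[ \t]+\[(?P<module>\w+)\])?"
)


def parse_frames(text):
    """Return a list of dicts, one per call-trace frame found in text."""
    frames = []
    for match in FRAME_RE.finditer(text):
        frames.append({
            "addr": match.group("addr"),
            "symbol": match.group("symbol"),
            "offset": int(match.group("offset"), 16),
            "size": int(match.group("size"), 16),
            # None means the symbol is in the core kernel, not a module.
            "module": match.group("module"),
        })
    return frames
```

Filtering the result on `module` gives a quick view of which loadable modules appear in the trace; in the log above those are snd_hda_intel, snd_pcm, and snd.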
Bug#701054: verbose kernel logs for thinkcentre m78: 3.2.0-4-686-pae, 3.7-trunk-686-pae, 3.8-trunk-686-pae
On Mon 2013-03-11 15:40:06 -0400, Daniel Kahn Gillmor wrote:
> None of these warnings or backtraces show up on 3.8, and pulseaudio also does not crash on 3.8.

However, i should note that one of the udev threads does still hang/fail on 3.8 for just under 180 seconds:

--
[8.730656] sd 7:0:0:2: [sdd] Attached SCSI removable disk
udevd[472]: worker [538] unexpectedly returned with status 0x0100
udevd[472]: worker [538] failed while handling '/devices/pci:00/:00:15.2/:03:00.3'
done.
Setting preliminary keymap...done.
Checking root file system...fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
/dev/mapper/krazy-root: clean, 134546/249984 fil[ 186.797993] EXT3-fs (dm-0): using internal journal
es, 920069/999424 blocks (check in 4 mounts)
done.
Cleaning up ifupdown
[ 186.935970] loop: module loaded
--

note that the device in question appears to be:

  *-serial
       description: IPMI SMIC interface
       product: Realtek Semiconductor Co., Ltd.
       vendor: Realtek Semiconductor Co., Ltd.
       physical id: 0.3
       bus info: pci@:03:00.3
       version: 01
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress msix vpd cap_list
       configuration: driver=ipmi_si latency=0
       resources: irq:17 ioport:e000(size=256) memory:fea1-fea100ff memory:fea0-fea03fff

should i be telling this machine to blacklist some module, or tell udev to avoid trying to do anything with this device?

--dkg
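The udevd messages in the message above name the failing device by its sysfs path, and lshw reports `driver=ipmi_si` for it. As a rough sketch of how that lookup could be automated (the function names `failed_devices` and `bound_driver` are hypothetical, and the device path in this archive has lost some digits, so it is matched only as an opaque string), the failing path can be pulled out of the log and its `driver` symlink under `/sys` resolved:

```python
import os
import re

# Matches the udevd failure line quoted in the thread, for example:
#   udevd[472]: worker [538] failed while handling '/devices/...'
FAILED_RE = re.compile(
    r"udevd\[(?P<daemon>\d+)\]: worker \[(?P<worker>\d+)\] "
    r"failed while handling '(?P<devpath>[^']+)'"
)


def failed_devices(log_text):
    """Return (worker_pid, devpath) for every failed udev worker in log_text."""
    return [
        (int(match.group("worker")), match.group("devpath"))
        for match in FAILED_RE.finditer(log_text)
    ]


def bound_driver(devpath, sysfs_root="/sys"):
    """Return the name of the driver bound to devpath, or None.

    Assumes the usual sysfs layout, where a bound device directory
    contains a 'driver' symlink pointing at the driver's directory.
    """
    link = os.path.join(sysfs_root, devpath.lstrip("/"), "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.path.realpath(link))
```

On the affected machine, passing each path from `failed_devices` to `bound_driver` would be expected to return the module name to consider when deciding what to blacklist; that expectation rests on the lshw output above rather than on anything verified here.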
Bug#701054: verbose kernel logs for thinkcentre m78: 3.2.0-4-686-pae, 3.7-trunk-686-pae, 3.8-trunk-686-pae
On Wed, 2013-03-06 at 13:51 -0500, Daniel Kahn Gillmor wrote:
[...]
> Of particular interest between the three boots is:
>
> * all three boots: [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored

This means: the BIOS includes a quirk for Linux, but we ignored it because there's no way to know which versions it was intended to apply to. (This was changed in Linux 2.6.23, so I have no idea why there are new machines like this.) You can try applying the quirk by adding 'acpi_osi=Linux' to the kernel command line.

> * just 3.2:
>   [1.676960] [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu
>   [1.690776] [Firmware Bug]: cpu 0, IBS interrupt offset 0 not available (MSRC001103A=0x0100)
>   [1.700255] Failed to setup IBS, -22
> * various hda-intel weirdnesses.

Does it still crash when starting PulseAudio? I assume it still doesn't actually produce sound output.

> please let me know if there are other diagnostics we can run with any of these kernels, or if you'd like me to try anything different.
[...]

The standard diagnostic script for ALSA is:
http://www.alsa-project.org/alsa-info.sh
If sound is still broken on 3.8 then please run this script there.

Ben.

--
Ben Hutchings
The obvious mathematical breakthrough [to break modern encryption] would be development of an easy way to factor large prime numbers. - Bill Gates
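After adding 'acpi_osi=Linux' as suggested above, it can be worth confirming that the running kernel actually received it. A minimal Python sketch follows, assuming the standard `/proc/cmdline` interface; the helper names are hypothetical, and the parser splits on whitespace only, so it does not handle quoted values that contain spaces.

```python
def cmdline_params(cmdline):
    """Map each kernel parameter name to the list of values given for it.

    A parameter may legitimately appear more than once, so values are
    collected in order. Bare flags such as 'quiet' map to [None].
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params.setdefault(key, []).append(value if sep else None)
    return params


def has_acpi_osi_linux(cmdline):
    """Return True if acpi_osi=Linux appears on the given command line."""
    return "Linux" in cmdline_params(cmdline).get("acpi_osi", [])


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as handle:
        return handle.read().strip()
```

Calling `has_acpi_osi_linux(read_cmdline())` on the booted system should agree with the kernel log, which reports "query honored via cmdline" when the parameter took effect.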
Bug#701054: verbose kernel logs for thinkcentre m78: 3.2.0-4-686-pae, 3.7-trunk-686-pae, 3.8-trunk-686-pae
On 03/10/2013 08:33 PM, Ben Hutchings wrote:
> This means: the BIOS includes a quirk for Linux, but we ignored it because there's no way to know which versions it was intended to apply to. (This was changed in Linux 2.6.23, so I have no idea why there are new machines like this.) You can try applying the quirk by adding 'acpi_osi=Linux' to the kernel command line.

thanks, i'll give that a try when i'm in front of the machine tomorrow.

>> * just 3.2:
>>   [1.676960] [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0x10400, but the register is already in use for vector 0xf9 on another cpu
>>   [1.690776] [Firmware Bug]: cpu 0, IBS interrupt offset 0 not available (MSRC001103A=0x0100)
>>   [1.700255] Failed to setup IBS, -22
>> * various hda-intel weirdnesses.
>
> Does it still crash when starting PulseAudio?

kernel 3.2 (which has the above output) does have a null dereference within pulseaudio -- you can see it at 185.6 seconds into the first of the three boots. neither 3.7 nor 3.8 has this null dereference.

using all three kernels, there is still a hang just before setting preliminary keymap, which is in /etc/rcS.d/S05keyboard-setup. So it's presumably hanging in one of:

  S01mountkernfs.sh
  S02udev
  S03mountdevsubfs.sh
  S04bootlogd

i'm not sure what userspace process is causing that hang (that is, which one is being terminated when i send ctrl-c through the console), but i can try to track it down.

> I assume it still doesn't actually produce sound output.

I haven't yet been able to coax any sound out of the system.

> The standard diagnostic script for ALSA is:
> http://www.alsa-project.org/alsa-info.sh
> If sound is still broken on 3.8 then please run this script there.

Will do.

Regards,

--dkg