Bug#701054: verbose kernel logs for thinkcentre m78: 3.2.0-4-686-pae, 3.7-trunk-686-pae, 3.8-trunk-686-pae

2013-03-11 Thread Daniel Kahn Gillmor
On Sun 2013-03-10 23:14:35 -0400, Daniel Kahn Gillmor wrote:

 On 03/10/2013 08:33 PM, Ben Hutchings wrote:
 You can try applying the quirk by adding 'acpi_osi=Linux' to the kernel
 command line.

 thanks, i'll give that a try when i'm in front of the machine tomorrow.

this changed the log message to the following (from a 3.2 boot), but the
behavior of the system remained the same, including the null
pointer dereference in pulseaudio:
 
[0.588857] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query honored via cmdline


  * just 3.2:
 [1.676960] [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0)
 for vector 0x10400, but the register is already in use for vector 0xf9
 on another cpu
 [1.690776] [Firmware Bug]: cpu 0, IBS interrupt offset 0 not
 available (MSRC001103A=0x0100)
 [1.700255] Failed to setup IBS, -22

  * various hda-intel weirdnesses.
 
 Does it still crash when starting PulseAudio?

 kernel 3.2 (which has the above output) does have a null dereference
 within pulseaudio -- you can see it at 185.6 seconds into the first of
 the three boots.  neither 3.7 nor 3.8 has this null dereference.

 using all three kernels, there is still a hang just before setting
 preliminary keymap, which is in /etc/rcS.d/S05keyboard-setup.  So it's
 presumably hanging in one of:

 S01mountkernfs.sh
 S02udev
 S03mountdevsubfs.sh
 S04bootlogd

 i'm not sure what userspace process is causing that hang (that is, which
 one is being terminated when i send ctrl-c through the console), but i
 can try to track it down.

Ah.  i discovered that if i just wait about 3 minutes at that hang instead
of pressing ctrl+C impatiently, the boot process does continue, showing
the following error messages:

udevd[416]: worker [495] unexpectedly returned with status 0x0100
udevd[416]: worker [495] failed while handling '/devices/pci:00/:00:15.2/:03:00.3'
done.
Setting preliminary keymap...done.

So it is almost certainly udev that is failing in this way.
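
To pick the failing device out of a longer boot log, the udevd messages above can be matched mechanically. This is only a minimal sketch: failed_devices() is a hypothetical helper, and the pattern assumes the message wording shown in this report, which may differ in other udev versions.

```python
import re

# Matches the "failed while handling" message as it appears in this report.
# \s+ tolerates the line wrap between the message and the quoted device path.
FAILED = re.compile(
    r"udevd\[(\d+)\]: worker \[(\d+)\] failed while handling\s+'([^']+)'"
)

def failed_devices(log: str):
    """Return (worker_pid, device_path) for every failed udev worker."""
    return [(int(worker), path) for _daemon, worker, path in FAILED.findall(log)]

sample = """\
udevd[416]: worker [495] unexpectedly returned with status 0x0100
udevd[416]: worker [495] failed while handling
'/devices/pci:00/:00:15.2/:03:00.3'
done.
"""

for worker, path in failed_devices(sample):
    print(worker, path)
```

The device path is copied exactly as it appears in the message, including the parts that are missing from it.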


Again, with the 3.2 kernel, pulseaudio crashes, but it is slightly
different.  it's no longer a null pointer dereference.

(fwiw, uid 109 appears to be the Debian-gdm user) Here is the backtrace
from the crash after letting the boot process time out with the
acpi_osi=Linux parameter:

[  201.019945] kernel tried to execute NX-protected page - exploit attempt? 
(uid: 109)
[  201.020006] BUG: unable to handle kernel paging request at f62b7940
[  201.020006] IP: [f62b7940] 0xf62b793f
[  201.020006] *pdpt = 01484001 *pde = 8000362001e3 
[  201.020006] Oops: 0011 [#1] SMP 
[  201.020006] Modules linked in: sha1_generic hmac cbc cts bridge stp bnep 
rfcomm bluetooth rfkill crc16 rpcsec_gss_krb5 uinput fuse nfsd nfs lockd 
fscache auth_rpcgss nfs_acl sunrpc loop snd_hda_codec_realtek 
snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_pcm powernow_k8 
snd_seq snd_timer snd_seq_device mperf crc32c_intel ipmi_si(+) i2c_piix4 
i2c_core tpm_tis aesni_intel cryptd snd aes_i586 tpm ipmi_msghandler processor 
psmouse soundcore snd_page_alloc aes_generic serio_raw evdev tpm_bios pcspkr 
thermal_sys ext3 jbd mbcache dm_mod microcode usb_storage sg usbhid hid sd_mod 
sr_mod cdrom crc_t10dif ohci_hcd ehci_hcd xhci_hcd ahci libahci libata r8169 
mii scsi_mod usbcore usb_common button [last unloaded: scsi_wait_scan]
[  201.052254] 
[  201.052254] Pid: 3299, comm: pulseaudio Not tainted 3.2.0-4-686-pae #1 
Debian 3.2.35-2 LENOVO 4865A14/Annapurna CRB
[  201.052254] EIP: 0060:[f62b7940] EFLAGS: 00010202 CPU: 3
[  201.052254] EIP is at 0xf62b7940
[  201.052254] EAX: f37f6e2c EBX: 0080 ECX: f365e4c0 EDX: f6ee5400
[  201.052254] ESI: f843b05c EDI: f692ef00 EBP: f6ee5a4c ESP: f6f71d98
[  201.052254]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[  201.052254] Process pulseaudio (pid: 3299, ti=f6f7 task=f2e0ea80 
task.ti=f6f7)
[  201.052254] Stack:
[  201.052254]  f8438515 0080 0080 f2ea5400 f365e4c0 f6deee00 f37f6e2c 
f692ef00
[  201.052254]  f6deee38 f6f71e04  f843c518 f6c31b00 f84e9bcb f365e4c0 
f367ea00
[  201.052254]  f2e0ea80 f84e9cbf f6f71e04 f367eb18 f367eb2c   
f2e0ea80
[  201.052254] Call Trace:
[  201.052254]  [f8438515] ? azx_pcm_open+0x171/0x1f7 [snd_hda_intel]
[  201.052254]  [f84e9bcb] ? snd_pcm_open_substream+0x36/0x68 [snd_pcm]
[  201.052254]  [f84e9cbf] ? snd_pcm_open+0xc2/0x1be [snd_pcm]
[  201.052254]  [c10320e5] ? try_to_wake_up+0x155/0x155
[  201.052254]  [f84e9e33] ? snd_pcm_playback_open+0x31/0x46 [snd_pcm]
[  201.052254]  [f83af49e] ? snd_open+0xf5/0x133 [snd]
[  201.052254]  [c10cf2d7] ? chrdev_open+0xf3/0x111
[  201.052254]  [c10cafb3] ? __dentry_open+0x17a/0x253
[  201.052254]  [c10cbd01] ? nameidata_to_filp+0x3a/0x45
[  201.052254]  [c10cf1e4] ? cdev_put+0x17/0x17
[  201.052254]  [c10d5dac] ? do_last+0x4f8/0x513
[  201.052254]  [c10d606e] ? path_openat+0xa1/0x28b
[  201.052254]  [c10d6301] ? do_filp_open+0x23/0x5c
[  201.052254]  [c102a0e5] ? should_resched+0x5/0x1e
[  
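
When reading a trace like the one above, it can help to list which module each symbol belongs to. The following is a rough sketch, not a real oops decoder: trace_entries() is a hypothetical helper, the "vmlinux" label for built-in symbols is an illustrative choice, and the pattern only covers the 32-bit entry format seen in this log. Note that the "?" before each symbol marks an address found on the stack that the kernel could not confirm as part of the real call chain, so the list is a hint rather than proof.

```python
import re

# One call-trace entry: "[address] ? symbol+0xoffset/0xsize [module]".
# The trailing "[module]" is absent for symbols built into the kernel image.
ENTRY = re.compile(
    r"\[[0-9a-f]{8}\] \? (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)

def trace_entries(trace: str):
    """Return (symbol, module) pairs; module is 'vmlinux' when none is shown."""
    return [(sym, mod or "vmlinux") for sym, mod in ENTRY.findall(trace)]

sample = """\
[  201.052254]  [f8438515] ? azx_pcm_open+0x171/0x1f7 [snd_hda_intel]
[  201.052254]  [f84e9bcb] ? snd_pcm_open_substream+0x36/0x68 [snd_pcm]
[  201.052254]  [c10cf2d7] ? chrdev_open+0xf3/0x111
"""

for sym, mod in trace_entries(sample):
    print(f"{mod:15} {sym}")
```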

2013-03-11 Thread Daniel Kahn Gillmor
On Mon 2013-03-11 15:40:06 -0400, Daniel Kahn Gillmor wrote:

 None of these warnings or backtraces show up on 3.8, and pulseaudio also
 does not crash on 3.8.

However, i should note that one of the udev threads does still hang/fail
on 3.8 for just under 180 seconds:

--
[8.730656] sd 7:0:0:2: [sdd] Attached SCSI removable disk
udevd[472]: worker [538] unexpectedly returned with status 0x0100

udevd[472]: worker [538] failed while handling 
'/devices/pci:00/:00:15.2/:03:00.3'

done.
Setting preliminary keymap...done.
Checking root file system...fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
/dev/mapper/krazy-root: clean, 134546/249984 fil[  186.797993] EXT3-fs (dm-0): 
using internal journal
es, 920069/999424 blocks (check in 4 mounts)
done.
Cleaning up ifupdown
[  186.935970] loop: module loaded
--
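
The "just under 180 seconds" figure can be checked against the kernel timestamps in the excerpt above: the last message before the hang is stamped 8.730656 and the next one 186.797993, a gap of about 178 seconds. A small sketch of that arithmetic follows; largest_gap() is a hypothetical helper, and it assumes the timestamps appear in bracketed seconds.microseconds form as they do here.

```python
import re

# Kernel timestamps look like "[8.730656]" or "[  186.797993]". Requiring a
# decimal point keeps bracketed names such as "[sdd]" from matching.
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]")

def largest_gap(log: str):
    """Return the (start, end) pair of consecutive timestamps that are
    furthest apart, or None if the log has fewer than two timestamps."""
    stamps = [float(s) for s in STAMP.findall(log)]
    if len(stamps) < 2:
        return None
    return max(zip(stamps, stamps[1:]), key=lambda pair: pair[1] - pair[0])

sample = """\
[8.730656] sd 7:0:0:2: [sdd] Attached SCSI removable disk
/dev/mapper/krazy-root: clean, 134546/249984 fil[  186.797993] EXT3-fs (dm-0): using internal journal
[  186.935970] loop: module loaded
"""

start, end = largest_gap(sample)
print(f"silent from {start} to {end}: {end - start:.3f} s")
```

Because the pattern is searched anywhere in a line, it also picks up the timestamp that the console interleaved into the middle of the e2fsck output.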


note that the device in question appears to be:

   *-serial
        description: IPMI SMIC interface
        product: Realtek Semiconductor Co., Ltd.
        vendor: Realtek Semiconductor Co., Ltd.
        physical id: 0.3
        bus info: pci@:03:00.3
        version: 01
        width: 64 bits
        clock: 33MHz
        capabilities: pm msi pciexpress msix vpd cap_list
        configuration: driver=ipmi_si latency=0
        resources: irq:17 ioport:e000(size=256) memory:fea1-fea100ff memory:fea0-fea03fff


should i be telling this machine to blacklist some module, or telling udev
to avoid trying to do anything with this device?

   --dkg




2013-03-10 Thread Ben Hutchings
On Wed, 2013-03-06 at 13:51 -0500, Daniel Kahn Gillmor wrote:
[...]
 Of particular interest between the three boots is:
 
  * all three boots: [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored

This means: the BIOS includes a quirk for Linux, but we ignored it
because there's no way to know which versions it was intended to apply
to.  (This was changed in Linux 2.6.23, so I have no idea why there are
new machines like this.)

You can try applying the quirk by adding 'acpi_osi=Linux' to the kernel
command line.
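
After rebooting, one way to confirm that the parameter actually reached the kernel is to look at the command line the running kernel reports in /proc/cmdline. The sketch below is illustrative only: has_acpi_osi_linux() is a hypothetical helper and the sample string is made up, not taken from the machine in this report.

```python
def has_acpi_osi_linux(cmdline: str) -> bool:
    """Return True if 'acpi_osi=Linux' is present as a separate parameter."""
    # Kernel parameters are whitespace-separated, so compare whole tokens
    # rather than doing a substring search.
    return "acpi_osi=Linux" in cmdline.split()

# Made-up command line for illustration; on a live system the string would
# come from reading /proc/cmdline.
example = "BOOT_IMAGE=/vmlinuz-3.2.0-4-686-pae ro quiet acpi_osi=Linux"
print(has_acpi_osi_linux(example))
```

Comparing whole tokens avoids a false match on a longer value that merely starts with the same text.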

  * just 3.2:
 [1.676960] [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0)
 for vector 0x10400, but the register is already in use for vector 0xf9
 on another cpu
 [1.690776] [Firmware Bug]: cpu 0, IBS interrupt offset 0 not
 available (MSRC001103A=0x0100)
 [1.700255] Failed to setup IBS, -22
 
  * various hda-intel weirdnesses.

Does it still crash when starting PulseAudio?

I assume it still doesn't actually produce sound output.

 please let me know if there are other diagnostics we can run with any of
 these kernels, or if you'd like me to try anything different.
[...]

The standard diagnostic script for ALSA is:
http://www.alsa-project.org/alsa-info.sh

If sound is still broken on 3.8 then please run this script there.

Ben.

-- 
Ben Hutchings
The obvious mathematical breakthrough [to break modern encryption] would be
development of an easy way to factor large prime numbers. - Bill Gates




2013-03-10 Thread Daniel Kahn Gillmor
On 03/10/2013 08:33 PM, Ben Hutchings wrote:
 This means: the BIOS includes a quirk for Linux, but we ignored it
 because there's no way to know which versions it was intended to apply
 to.  (This was changed in Linux 2.6.23, so I have no idea why there are
 new machines like this.)
 
 You can try applying the quirk by adding 'acpi_osi=Linux' to the kernel
 command line.

thanks, i'll give that a try when i'm in front of the machine tomorrow.

  * just 3.2:
 [1.676960] [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0)
 for vector 0x10400, but the register is already in use for vector 0xf9
 on another cpu
 [1.690776] [Firmware Bug]: cpu 0, IBS interrupt offset 0 not
 available (MSRC001103A=0x0100)
 [1.700255] Failed to setup IBS, -22

  * various hda-intel weirdnesses.
 
 Does it still crash when starting PulseAudio?

kernel 3.2 (which has the above output) does have a null dereference
within pulseaudio -- you can see it at 185.6 seconds into the first of
the three boots.  neither 3.7 nor 3.8 has this null dereference.

using all three kernels, there is still a hang just before setting
preliminary keymap, which is in /etc/rcS.d/S05keyboard-setup.  So it's
presumably hanging in one of:

S01mountkernfs.sh
S02udev
S03mountdevsubfs.sh
S04bootlogd

i'm not sure what userspace process is causing that hang (that is, which
one is being terminated when i send ctrl-c through the console), but i
can try to track it down.
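
The reasoning behind that candidate list can be sketched as follows, assuming the start links in /etc/rcS.d run in name order, so anything sorting before S05keyboard-setup has already been started when the hang occurs. scripts_before() is a hypothetical helper, and the list contains only the names mentioned in this message, not a real directory listing.

```python
def scripts_before(names, target):
    """Return the scripts that sort (and so start) before the target."""
    ordered = sorted(names)
    return ordered[:ordered.index(target)]

# Only the names mentioned in this message, deliberately out of order.
rcs = [
    "S05keyboard-setup",
    "S03mountdevsubfs.sh",
    "S01mountkernfs.sh",
    "S04bootlogd",
    "S02udev",
]

print(scripts_before(rcs, "S05keyboard-setup"))
```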

 I assume it still doesn't actually produce sound output.

I haven't yet been able to coax any sound out of the system.

 The standard diagnostic script for ALSA is:
 http://www.alsa-project.org/alsa-info.sh
 
 If sound is still broken on 3.8 then please run this script there.

Will do.

Regards,

--dkg


