Bug#631187: Kernel panics when removing external hard drive
found 631187 3.0.0-1
thanks

Hi,

FYI: The problem still occurs with Linux 3.0.0-1.

Best regards

Alexander Kurtz

signature.asc
Description: This is a digitally signed message part
Bug#631187: Kernel panics when removing external hard drive
On Wed, 2011-07-13 at 22:03 +1000, Linh Nguyen wrote:
> Hello Alexander,
>
> How are you? I came across your post
> http://lists.debian.org/debian-kernel/2011/06/msg00580.html detailing
> an issue similar to what I am experiencing.
>
> Every time I unmount a portable HDD (normal USB sticks are fine), I get
> a kernel panic with the "power/level is deprecated; use power/control
> instead" error message.
>
> Despite my extensive googling, I've not been able to find a solution. I
> was wondering whether or not you have solved your issue. Cheers. :)
>
>
> Sincerely,
>
> L

Sorry, I've got no solution either. Since this is kind of a low-priority
bug for me, I'm fine with manually unmounting (using umount or some GUI)
my external drive before removing it. My current plan is to wait for 3.0
and then maybe do a git bisect if it's not fixed by then.

However, you should check out the Debian bug report[1], the Ubuntu bug
report[2] and the upstream bug report[3]; maybe you'll find something
there.

Best regards

Alexander Kurtz

[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=631187
[2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/793796
[3] https://bugzilla.kernel.org/show_bug.cgi?id=38842
Bug#631187: Kernel panics when removing external hard drive
forwarded 631187 https://bugzilla.kernel.org/show_bug.cgi?id=38842
quit

Hi,

Alexander Kurtz wrote:

> I just tested 2.6.39-3 from sid and 3.0.0~rc6-1~experimental.1 from
> experimental. Unfortunately both reliably panic when safely removing my
> external hard drive. 2.6.38-5 (still) works fine. Seems like it's time
> for me to do a git bisect, or do you have any other ideas?

I'd suggest attaching the full dmesg from 3.0.0~rc6 and any other
relevant information to https://bugzilla.kernel.org/show_bug.cgi?id=38842
first. Maybe someone upstream will have ideas.

Thanks again.
Jonathan

--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#631187: Kernel panics when removing external hard drive
On Fri, 2011-07-08 at 04:30 +0100, Ben Hutchings wrote:
> Alexander, please test the new package version.

I just tested 2.6.39-3 from sid and 3.0.0~rc6-1~experimental.1 from
experimental. Unfortunately both reliably panic when safely removing my
external hard drive. 2.6.38-5 (still) works fine. Seems like it's time
for me to do a git bisect, or do you have any other ideas?

Best regards

Alexander Kurtz
Bug#631187: Kernel panics when removing external hard drive
Hi Ben,

Ben Hutchings wrote:

> There is a byte missing between the two lines (in fact, the very byte
> which RIP points to), and you are mixing decimal and hexadecimal
> offsets.
>
> In fact RIP is pointing into the second half of this test:
>
>	if ((rq->cmd_flags & REQ_SORTED) &&
>	    e->ops->elevator_completed_req_fn)
>
> and e->ops was NULL.

Ah, that makes sense.

> This might be fixed by:
>
> commit 0769e21bf4b5cf48878c1ca819276e80465b39e7
> Author: James Bottomley
> Date:   Wed May 25 15:52:14 2011 -0500
>
>     Fix oops caused by queue refcounting failure
>
>     commit e73e079bf128d68284efedeba1fbbc18d78610f9 upstream.

As does that. Thanks for explaining.
Bug#631187: Kernel panics when removing external hard drive
On Tue, 2011-07-05 at 17:51 -0500, Jonathan Nieder wrote:
> Hi,
>
> Alexander Kurtz wrote:
[...]
> > [ 1491.696825] Code: 40 74 35 83 7e 44 01 74 04 a8 40 74 2b 83 e0 11 ff c8
> > 0f 95 c0 83 e0 01 48 05 fc 00 00 00 ff 4c 87 04 f6 46 41 04 74 10 48 8b 02
> > [ 1491.696825] 8b 40 48 48 85 c0 74 04 41 58 ff e0 59 c3 48 8d be 80 00 00
> > [ 1491.696825] RIP [] elv_completed_request+0x38/0x47
>
> Disassembly, for convenience (following the hints from
> Documentation/oops-tracing.txt):
[...]

There is a byte missing between the two lines (in fact, the very byte
which RIP points to), and you are mixing decimal and hexadecimal
offsets.

In fact RIP is pointing into the second half of this test:

	if ((rq->cmd_flags & REQ_SORTED) &&
	    e->ops->elevator_completed_req_fn)

and e->ops was NULL.

This might be fixed by:

commit 0769e21bf4b5cf48878c1ca819276e80465b39e7
Author: James Bottomley
Date:   Wed May 25 15:52:14 2011 -0500

    Fix oops caused by queue refcounting failure

    commit e73e079bf128d68284efedeba1fbbc18d78610f9 upstream.

which was included in stable version 2.6.39.2 and our package version
2.6.39-3.

Alexander, please test the new package version.

Ben.

--
Ben Hutchings
The two most common things in the universe are hydrogen and stupidity.
Bug#631187: Kernel panics when removing external hard drive
Hi,

Alexander Kurtz wrote:
> On Wed, 2011-06-22 at 03:40 +0100, Ben Hutchings wrote:

>> The panic message shows there was an earlier kernel warning; please can
>> you provide that.
>
> Thanks to netconsole (a really great tool!) I was able to do so. The
> attached kernel log starts right before I plug the drive in.
> Surprisingly the kernel didn't crash the first time, but after trying
> again, everything went as expected (see lines 17 and 35).

Sorry for the long silence. Let's see:

> [ 1421.182657] sd 7:0:0:0: [sdc] Attached SCSI disk
> [ 1454.865926] WARNING! power/level is deprecated; use power/control instead

Seems harmless enough.

> [ 1478.728383] sd 8:0:0:0: [sdc] Attached SCSI disk
> [ 1491.693027] BUG: unable to handle kernel NULL pointer dereference at 0048
> [ 1491.693229] IP: [] elv_completed_request+0x38/0x47

The panic.

[...]
> [ 1491.696825] Code: 40 74 35 83 7e 44 01 74 04 a8 40 74 2b 83 e0 11 ff c8 0f 95 c0 83 e0 01 48 05 fc 00 00 00 ff 4c 87 04 f6 46 41 04 74 10 48 8b 02
> [ 1491.696825] 8b 40 48 48 85 c0 74 04 41 58 ff e0 59 c3 48 8d be 80 00 00
> [ 1491.696825] RIP [] elv_completed_request+0x38/0x47

Disassembly, for convenience (following the hints from
Documentation/oops-tracing.txt):

| <+0>:	rex je 0x6008b8
| <+3>:	cmpl   $0x1,0x44(%rsi)
| <+7>:	je     0x60088d
| <+9>:	test   $0x40,%al
| <+11>:	je     0x6008b8
| <+13>:	and    $0x11,%eax
| <+16>:	dec    %eax
| <+18>:	setne  %al
| <+21>:	and    $0x1,%eax
| <+24>:	add    $0xfc,%rax
| <+30>:	decl   0x4(%rdi,%rax,4)
| <+34>:	testb  $0x4,0x41(%rsi)
| <+38>:	je     0x6008b8
| <+40>:	mov    (%rdx),%rax
| <+43>:	cmp    %ah,0x40(%rdx)
| <+46>:	rex.W
| <+47>:	test   %rax,%rax
| <+50>:	je     0x6008b8
| <+52>:	pop    %r8
| <+54>:	jmpq   *%rax
| <+56>:	pop    %rcx
| <+57>:	retq
| <+58>:	lea    0x80(%rsi),%rdi

So offset 0x38 is the jump in

	if ((rq->cmd_flags & REQ_SORTED) &&

As for why that involves an access to the address 0x48: well, that is
beyond my depth. rq->cmd_flags was already accessed in the check

	if (blk_account_rq(rq))

Maybe the actual cause of the fault is some different instruction and
the instruction pointer is not to be trusted (?).

I suppose if I were in this situation, I'd sprinkle
block/elevator.c::elv_completed_request with printk calls to be able to
witness exactly what happens.

Sorry for the trouble, and hope that helps.
Jonathan
Bug#631187: Kernel panics when removing external hard drive
On Wed, 2011-06-22 at 03:40 +0100, Ben Hutchings wrote:
> Which version of GNOME is this?

2.30.2. Apart from the newer kernel, this is a pure Squeeze system.

> The panic message shows there was an earlier kernel warning; please can
> you provide that.

Thanks to netconsole (a really great tool!) I was able to do so. The
attached kernel log starts right before I plug the drive in.
Surprisingly the kernel didn't crash the first time, but after trying
again, everything went as expected (see lines 17 and 35).

Please note that I replaced the drive's serial number.

Best regards

Alexander Kurtz

[ 1420.016231] usb 1-3: new high speed USB device number 6 using ehci_hcd
[ 1420.150838] usb 1-3: New USB device found, idVendor=1058, idProduct=1010
[ 1420.150867] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1420.150891] usb 1-3: Product: External HDD
[ 1420.150900] usb 1-3: Manufacturer: Western Digital
[ 1420.150914] usb 1-3: SerialNumber: XX
[ 1420.152513] scsi7 : usb-storage 1-3:1.0
[ 1421.154225] scsi 7:0:0:0: Direct-Access WD 2500BEV External 1.75 PQ: 0 ANSI: 4
[ 1421.158259] sd 7:0:0:0: [sdc] 488397168 512-byte logical blocks: (250 GB/232 GiB)
[ 1421.159053] sd 7:0:0:0: [sdc] Write Protect is off
[ 1421.159069] sd 7:0:0:0: [sdc] Mode Sense: 23 00 00 00
[ 1421.159080] sd 7:0:0:0: [sdc] Assuming drive cache: write through
[ 1421.161796] sd 7:0:0:0: [sdc] Assuming drive cache: write through
[ 1421.179973] sdc: sdc1
[ 1421.182628] sd 7:0:0:0: [sdc] Assuming drive cache: write through
[ 1421.182657] sd 7:0:0:0: [sdc] Attached SCSI disk
[ 1454.865926] WARNING! power/level is deprecated; use power/control instead
[ 1454.944178] usb 1-3: USB disconnect, device number 6
[ 1477.564219] usb 1-2: new high speed USB device number 7 using ehci_hcd
[ 1477.698789] usb 1-2: New USB device found, idVendor=1058, idProduct=1010
[ 1477.698817] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1477.698841] usb 1-2: Product: External HDD
[ 1477.698850] usb 1-2: Manufacturer: Western Digital
[ 1477.698867] usb 1-2: SerialNumber: XX
[ 1477.700552] scsi8 : usb-storage 1-2:1.0
[ 1478.702244] scsi 8:0:0:0: Direct-Access WD 2500BEV External 1.75 PQ: 0 ANSI: 4
[ 1478.705375] sd 8:0:0:0: [sdc] 488397168 512-byte logical blocks: (250 GB/232 GiB)
[ 1478.705994] sd 8:0:0:0: [sdc] Write Protect is off
[ 1478.706023] sd 8:0:0:0: [sdc] Mode Sense: 23 00 00 00
[ 1478.706035] sd 8:0:0:0: [sdc] Assuming drive cache: write through
[ 1478.708338] sd 8:0:0:0: [sdc] Assuming drive cache: write through
[ 1478.725489] sdc: sdc1
[ 1478.728353] sd 8:0:0:0: [sdc] Assuming drive cache: write through
[ 1478.728383] sd 8:0:0:0: [sdc] Attached SCSI disk
[ 1491.693027] BUG: unable to handle kernel NULL pointer dereference at 0048
[ 1491.693229] IP: [] elv_completed_request+0x38/0x47
[ 1491.693380] PGD 1b7f16067
[ 1491.693435] Buffer I/O error on device sdc1, logical block 61048968
[ 1491.693448] Buffer I/O error on device sdc1, logical block 61048968
[ 1491.693486] Buffer I/O error on device sdc1, logical block 61048992
[ 1491.693494] Buffer I/O error on device sdc1, logical block 61048992
[ 1491.693510] Buffer I/O error on device sdc1, logical block 61048998
[ 1491.693517] Buffer I/O error on device sdc1, logical block 61048998
[ 1491.693554] Buffer I/O error on device sdc1, logical block 61048999
[ 1491.693567] Buffer I/O error on device sdc1, logical block 0
[ 1491.693578] Buffer I/O error on device sdc1, logical block 0
[ 1491.693590] Buffer I/O error on device sdc1, logical block 256
[ 1491.694599] PUD 1b7f23067 PMD 0
[ 1491.694689] Oops: [#1] SMP
[ 1491.694777] last sysfs file: /sys/devices/pci:00/:00:12.2/usb1/1-2/power/autosuspend
[ 1491.694945] CPU 1
[ 1491.694991] Modules linked in: netconsole configfs parport_pc ppdev lp parport bridge stp bnep rfcomm bluetooth powernow_k8 mperf cpufreq_stats cpufreq_userspace cpufreq_powersave cpufreq_conservative binfmt_misc fuse snd_hda_codec_hdmi joydev snd_hda_codec_conexant radeon arc4 ecb ttm drm_kms_helper snd_hda_intel thinkpad_acpi rtl8192ce drm snd_hda_codec rtl8192c_common snd_hwdep i2c_algo_bit rtlwifi snd_pcm snd_seq snd_seq_device mac80211 snd_timer snd cfg80211 i2c_piix4 shpchp soundcore tpm_tis tpm psmouse tpm_bios snd_page_alloc wmi nvram k10temp rfkill pcspkr pci_hotplug i2c_core evdev serio_raw battery video ac edac_core power_supply edac_mce_amd button processor ext4 mbcache jbd2 crc16 sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod raid10 raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 raid0 multipath linear md_mod sd_mod usb_storage crc_t10dif uas ahci libahci ohci_hcd libata ehci_hcd r8169 thermal scsi_mod usbcore mii thermal_sys [last unloaded: configfs]
[ 1491.696825]
[ 1491.696825] Pid: 10, comm: ksoftirqd/1 Tainted: G W 2.6.39-2-amd64 #1 LENOVO 0221A16/02
Bug#631187: Kernel panics when removing external hard drive
On Tue, 2011-06-21 at 11:08 +0200, Alexander Kurtz wrote:
> Package: linux-2.6
> Version: 2.6.39-1
> Severity: serious
>
> Hi,
>
> I've got a pretty normal Debian Squeeze AMD64 system with the current
> kernel from Wheezy. Since 2.6.39-1 I experience this bug:
>
> 1. I plug in an external USB hard drive with an NTFS file system on
>    its first partition.
> 2. The drive gets automatically mounted using the fuse-based NTFS
>    driver (ntfs-3g).
> 3. I right-click on the icon representing the drive on the GNOME
>    desktop and select "Safely Remove Drive".

Which version of GNOME is this?

> 4. The kernel panics, see attached screenshot.
[...]

The panic message shows there was an earlier kernel warning; please can
you provide that.

Ben.

--
Ben Hutchings
I'm always amazed by the number of people who take up solipsism because
they heard someone else explain it. - E*Borg on alt.fan.pratchett