I have a very cool hard drive for beta testing: sometimes it hangs
completely.
When it was in this state, previous kernels tried to read from it about 3
times a minute and logged 'hdc : lost interrupt' in syslog. Then they
attempted to reset the drive, at which point the drive spun up again and
resumed normal operation (well... the drive did, not my applications, but
that's another story).
This behaviour seems to have changed with 2.6.14 (I'm using -2mdk): the
kernel no longer resets the drive and just keeps logging the 'hdc : lost
interrupt' message in syslog. So I decided to reset the drive myself, and
that gave me an oops:
# hdparm -w /dev/hdc
/dev/hdc:
------------[ cut here ]------------
kernel BUG at <bad filename>:16583!
invalid operand: 0000 [#1]
Modules linked in: appletalk ipx ipt_IFWLOG ipt_psd ip_set_iptree iptable_raw
ipt_ipp2p iptable_mangle ipt_ULOG ipt_TTL ipt_ttl ipt_TOS ipt_tos ipt_TCPMSS
ipt_tcpmss ipt_state ipt_set ipt_sctp ipt_SAME ipt_REJECT ipt_REDIRECT
ipt_recent ipt_realm ipt_pkttype nvidia ipt_physdev ipt_owner ipt_NOTRACK
ipt_NETMAP ipt_multiport ipt_MASQUERADE ipt_MARK ipt_mark ipt_mac ipt_LOG
ipt_limit ipt_length ipt_iprange ipt_helper ipt_hashlimit ipt_esp ipt_ECN
ipt_ecn ipt_DSCP ipt_dscp ipt_conntrack ipt_CONNMARK ipt_connmark ipt_comment
ipt_CLUSTERIP ipt_CLASSIFY ipt_ah ipt_addrtype ip_set_portmap ip_set_macipmap
ip_set_ipmap nfsd ip_set_iphash ip_set exportfs ip_nat_irc lockd ip_nat_tftp
nfs_acl ip_nat_ftp
sunrpc iptable_nat ip_nat ip_conntrack_irc ip_conntrack_tftp ip_conntrack_ftp
ip_conntrack nfnetlink iptable_filter ip_tables ipv6 via_rhine mii ne2k_pci
8390 snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq af_packet
snd_pcm_oss snd_mixer_oss snd_emu10k1 snd_rawmidi snd_ac97_codec snd_ac97_bus
snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem snd_hwdep video
thermal snd soundcore processor fan container button battery ac ide_cd
binfmt_misc loop nls_iso8859_15 nls_cp850 vfat fat reiserfs via_agp agpgart
spca5xx videodev usbmouse ehci_hcd usbhid uhci_hcd usbcore ext3 jbd
CPU: 0
EIP: 0060:[<c02a04a6>] Tainted: P VLI
EFLAGS: 00210086 (2.6.14-2mdk)
EIP is at generic_ide_ioctl+0x5f6/0x7c0
eax: eff02380 ebx: 00200206 ecx: 00000010 edx: c02a8130
esi: 0000007c edi: d4887da8 ebp: d4887e40 esp: d4887d58
ds: 007b es: 007b ss: 0068
Process hdparm (pid: 10056, threadinfo=d4886000 task=d924c570)
Stack: c048ee04 c03756ca 00000004 00000000 00000016 00000000 00000011 00010001
0000007c 0000031c 0000000f 0000007c 0000031c 00000000 00000240 00000008
00000010 00000007 00000000 00000001 00000011 d4886000 00002748 00000004
Call Trace:
[<c0103e7b>] show_stack+0xab/0xf0
[<c0104042>] show_registers+0x162/0x200
[<c0104258>] die+0xc8/0x160
[<c0340d79>] do_trap+0x89/0xd0
[<c010460a>] do_invalid_op+0xaa/0xc0
[<c0103b13>] error_code+0x4f/0x54
[<c02aeda9>] idedisk_ioctl+0x39/0x40
[<c027e689>] blkdev_driver_ioctl+0x59/0x60
[<c027e805>] blkdev_ioctl+0x175/0x1a0
[<c01a4acb>] block_ioctl+0x2b/0x30
[<c01b1379>] do_ioctl+0x69/0x1a0
[<c01b160f>] vfs_ioctl+0x5f/0x1d0
[<c01b17c1>] sys_ioctl+0x41/0x70
[<c0102fff>] sysenter_past_esp+0x54/0x75
Code: 0c 00 01 00 00 9c 5b fa b9 ca 56 37 c0 89 4c 24 04 8b 45 08 89 04 24 e8
19 11 00 00 8b 55 08 8b 42 6c 8b 40 08 8b 10 85 d2 74 02 <0f> 0b c7 40 08 01
00 00 00 53 9d 8b 4d 08 89 0c 24 e8 a4 40 00
Segmentation fault
So I could not get the drive back to its normal state and had to reboot
the hard way (with the magic keys).
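For reference, as far as I understand it, `hdparm -w` boils down to a single
HDIO_DRIVE_RESET ioctl on the device (request 0x031c in <linux/hdreg.h>, which
appears to match the 0000031c values in the stack dump above). Here is a
minimal sketch of that call, in case someone wants to reproduce the problem
without hdparm. The helper name reset_drive is mine and purely illustrative;
this is not hdparm's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/hdreg.h>   /* HDIO_DRIVE_RESET */

/* Hypothetical helper (not taken from hdparm): ask the IDE driver to
 * reset the drive behind `path`.
 * Returns 0 on success or a negative errno value on failure. */
static int reset_drive(const char *path)
{
    assert(path != NULL);

    /* O_NONBLOCK so that open() does not wait on the device itself. */
    int fd = open(path, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return -errno;

    int ret = 0;
    /* The reset request takes no argument. */
    if (ioctl(fd, HDIO_DRIVE_RESET, NULL) < 0)
        ret = -errno;   /* saved before close() can overwrite errno */

    close(fd);
    return ret;
}
```

Calling reset_drive("/dev/hdc") as root should be roughly equivalent to the
command above, so on this kernel I would expect it to run into the same BUG.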
Is this helpful to you? Do you need more info? I'd like to avoid it, but
should I report the oops on the lkml? Just ask.
Vincent.
PS: I don't know which list is the right one for this issue, so I am
posting to both of them.