Hello,
As one of my usual tests I run the following script:
for i in `find /proc -type f`; do
    echo -n "cat $i > /dev/null ... "
    cat "$i" > /dev/null
    echo done
done
This time the culprit is /proc/net/packet. The cat process gets killed:
$ cat /proc/net/packet
Segmentation fault
That message gets lost among the many messages from the script, and for some
reason there is no info in syslog (why?). I could capture the oops only after
issuing sysrq-7 or greater. That's why I didn't catch the oops earlier.
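Since the "Segmentation fault" line is easy to miss in the script output, one
option is to check the exit status of each read instead. The following is only
a sketch: scan_tree, its optional second argument and the KILLED line format
are made-up names for illustration, and it relies on the usual shell convention
that a command killed by a signal returns 128 plus the signal number (139 for
SIGSEGV).

```shell
#!/bin/sh
# Sketch: read every regular file under a directory and report only the
# files whose reader was killed by a signal (exit status above 128).
# scan_tree, the reader argument and the KILLED format are illustrative.
scan_tree() {
    dir=$1
    reader=${2:-cat}    # command used to read each file, cat by default
    find "$dir" -type f 2>/dev/null | while IFS= read -r f; do
        status=0
        "$reader" "$f" > /dev/null 2>&1 || status=$?
        if [ "$status" -gt 128 ]; then
            echo "KILLED signal=$((status - 128)) file=$f"
        fi
    done
}

# Example invocation (not run here): scan_tree /proc
```

As with the original loop, a read that blocks forever on some /proc file will
stall the scan, so this only helps with spotting the killed readers.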
I found it because the bug makes my sparc64 box need a hardware reset most of
the time it happens, and it produces an oops two screens long. x86 kills the
cat process, but the system is still usable and running fine. Bisection points
to:
git-ubi.patch
GOOD
#
git-net.patch
BAD
ipsec-fix-reversed-icmp6-policy-check.patch
but this seems to be far from precise :)
$ grep ^commit git-net.patch | wc -l
361
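For a rough idea of how much work is left: assuming git-net.patch could be
bisected commit by commit in its own git tree (which I have not tried), the
worst-case number of additional build-and-test rounds follows from halving
the range until one commit remains. bisect_steps below is just an illustrative
helper name.

```shell
#!/bin/sh
# Sketch: worst-case number of halving steps needed to narrow N candidate
# commits down to one. bisect_steps is an illustrative helper name.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))    # keep the larger half (worst case)
        steps=$((steps + 1))
    done
    echo "$steps"
}

bisect_steps 361    # prints 9
```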
Not sure if this is important, but when bisecting the mm tree the oops got
shorter at some point, so maybe some other patch is also involved. This one is
from x86:
[ 194.508398] BUG: unable to handle kernel paging request at virtual address bd47
[ 194.508412] printing eip: c0135d59 *pde =
[ 194.508419] Oops: [#1] PREEMPT
[ 194.508424] last sysfs file: /devices/pci:00/:00:01.0/:01:05.0/resource
[ 194.508428] Modules linked in: usbhid hid orinoco_cs orinoco hermes pcmcia firmware_class uhci_hcd ehci_hcd usbcore psmouse yenta_socket rsrc_nonstatic rtc 8139too
[ 194.508443]
[ 194.508447] Pid: 5368, comm: cat Not tainted (2.6.24-rc5 #9)
[ 194.508450] EIP: 0060:[c0135d59] EFLAGS: 00210046 CPU: 0
[ 194.508466] EIP is at __lock_acquire+0x5b/0xfc4
[ 194.508469] EAX: 0022 EBX: 00200246 ECX: bd43 EDX: 0002
[ 194.508472] ESI: bd43 EDI: EBP: d816ce80 ESP: d816ce14
[ 194.508475] DS: 007b ES: 007b FS: GS: 0033 SS: 0068
[ 194.508479] Process cat (pid: 5368, ti=d816c000 task=d826a000 task.ti=d816c000)
[ 194.508481] Stack: c0135a21 d826a000 d816ce38 c0135697 d826a000 c0146ded
[ 194.508490]        c1304f98 0002 bd43 0001 d826a000 d816cec0 c013681d
[ 194.508498]        0006 0003 c03daa08 0001 0044 02ad 0005
[ 194.508506] Call Trace:
[ 194.508508] [c01035d8] show_trace_log_lvl+0x1a/0x30
[ 194.508518] [c0103693] show_stack_log_lvl+0xa5/0xca
[ 194.508523] [c0103787] show_registers+0xcf/0x23f
[ 194.508528] [c0103a04] die+0x10d/0x1f5
[ 194.508532] [c0110cee] do_page_fault+0x27e/0x5f0
[ 194.508540] [c034684a] error_code+0x6a/0x70
[ 194.508550] [c0136d20] lock_acquire+0x5e/0x76
[ 194.508555] [c03461a6] _read_lock+0x35/0x42
[ 194.508560] [c02d957a] sock_i_ino+0x14/0x30
[ 194.508568] [c032c7e8] packet_seq_show+0x19/0xa0
[ 194.508576] [c0179f5c] seq_read+0x19a/0x29e
[ 194.508583] [c0191b25] proc_reg_read+0x57/0x78
[ 194.508590] [c0161c8a] vfs_read+0x89/0x11d
[ 194.508596] [c0162054] sys_read+0x3d/0x64
[ 194.508600] [c010261a] sysenter_past_esp+0x5f/0xa5
[ 194.508605] ===
[ 194.508607] Code: c0 85 c0 0f 84 64 03 00 00 9c 58 f6 c4 02 0f 85 b8 07 00 00 83 ff 07 0f 87 de 07 00 00 85 ff 8d 76 00 0f 85 4f 03 00 00 8b 4d c0 8b 71 04 85 f6 0f 84 41 03 00 00 89 f0 e8 d8 d7 ff ff 85 c0 0f
[ 194.508651] EIP: [c0135d59] __lock_acquire+0x5b/0xfc4 SS:ESP 0068:d816ce14
[ 194.508660] note: cat[5368] exited with preempt_count 2
.config attached.
Regards,
Mariusz
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.24-rc5
# Sun Dec 16 00:22:27 2007
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
# CONFIG_X86_64 is not set
CONFIG_X86=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
# CONFIG_RWSEM_GENERIC_SPINLOCK is not set
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
# CONFIG_GENERIC_TIME_VSYSCALL is not set
CONFIG_ARCH_SUPPORTS_OPROFILE=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
# CONFIG_ZONE_DMA32 is not set
CONFIG_ARCH_POPULATES_NODE_MAP=y
# CONFIG_AUDIT_ARCH is not set
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_KTIME_SCALAR=y
CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config
#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
#