On Wed, 16 Apr 2014, Antonio Dupont wrote:

I've installed kernel-lt-3.2.57-1.el5.elrepo.i686.rpm on my CentOS 5.5
system because I'm trying to troubleshoot a third party USB hardware issue
that causes the system to hang.  If I issue the commands:

echo 1 > /proc/sys/kernel/sysrq
echo "c" > /proc/sysrq-trigger

The system looks like it's starting to capture the dump information, but
complains about many kernel-module version mismatches and the kernel that
it says it's mismatching is cryptic.  For example:

Mounting sysfs filesystem
Creating /dev
Creating initial device nodes
Loading scsi_mod.ko module
insmod: kernel-module version mismatch
/lib/modules/3.2.57-1.el5.elrepo/scsi_mod.ko was compiled for kernel
version M?**=a while this kernel is version 3.2.57-1.el5.elrepo

The M?**=a is the cryptic part.  The ** are actually supposed to be boxes,
but I don't know how to type that symbol.

With the latest CentOS 5.X debug kernel (kernel-debug-2.6.18-371.6.1.el5)
when I issue those commands kdump starts and captures system information in
a vmcore file.

Does kernel-lt-3.2.57-1.el5.elrepo have the capability to capture kernel
dump information?  If so, are there any apparent indications of what I am
doing wrong?  If not, I will work on compiling my own 3.2.57 kernel with the
necessary parameters.  Any suggestions are appreciated.

Reading up on the details, this does not strike me as a problem caused by the lack of a debug kernel. kdump works fine on normal kernels on RHEL. Debug kernels are just normal kernels with additional debug functionality that may help you track down a strange kernel bug, at the cost of a slower kernel. Don't confuse them with kernel-debuginfo, which BTW is also not needed for a working kdump on RHEL (although it may be useful for post-mortem analysis of your vmcore files).

It seems that the scsi_mod.ko kernel module inside the kdump initrd does not match the kernel that is booted once the system crashes. That would be weird, as I assume the kernel boots fine and loads the same module without a glitch on a normal boot. The cryptic part looks really funny indeed.

I haven't tried -lt kernels myself, but it is possible that these kernels need different parameters to enable kdump. On RHEL6 you can e.g. use crashkernel=auto, whereas RHEL5 needs a specific offset and size based on your physical RAM.
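A quick way to see which crashkernel= value a kernel was actually booted with is to pull it out of the command line. This is only a minimal sketch: crashkernel_arg is a helper name I made up, and the sample values in the comments are illustrations, not a recommendation for your hardware.

```shell
# Hypothetical helper: print the value of the crashkernel= parameter found
# in a kernel command line, or return non-zero if it is absent.
#   crashkernel_arg "ro quiet crashkernel=128M@16M"   (illustrative value)
crashkernel_arg() {
    # Rely on word splitting to walk the space-separated parameters.
    for word in $1; do
        case "$word" in
            crashkernel=*)
                # Strip the "crashkernel=" prefix and print the rest.
                printf '%s\n' "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

On a running system you would call it as crashkernel_arg "$(cat /proc/cmdline)". If it prints nothing, no memory was reserved for a capture kernel at boot, and kdump cannot work regardless of the initrd contents.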

So I would investigate three things: what the kdump documentation for this specific kernel version instructs you to do; the version information of this specific scsi_mod module (try modinfo); and whether something unusual is going on with the kdump initrd that is created when you run /etc/init.d/kdump restart (you can delete it from /boot and have it recreated if need be).
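The modinfo check can be sketched roughly as follows. The two function names are my own invention. The only thing I rely on is that the first field of a module's vermagic string is the kernel release it was built for. The initrd path in the last comment is an assumption about how the RHEL5 kdump script names its image, so verify it on your system.

```shell
# Hypothetical helper: extract the kernel release from a vermagic string,
# i.e. everything before the first space.
vermagic_release() {
    printf '%s\n' "${1%% *}"
}

# Hypothetical helper: succeed when the vermagic string ($1) names the
# given kernel release ($2).
vermagic_matches() {
    [ "$(vermagic_release "$1")" = "$2" ]
}

# Possible use on the affected system (not executed here):
#   vermagic_matches "$(modinfo -F vermagic scsi_mod)" "$(uname -r)" \
#       || echo "scsi_mod was built for a different kernel"
#
# Listing the kdump initrd contents, assuming it is a gzipped cpio archive
# named /boot/initrd-<release>kdump.img:
#   zcat "/boot/initrd-$(uname -r)kdump.img" | cpio -it | grep scsi_mod
```

If the installed module matches the running kernel, the next suspect is the copy inside the kdump initrd, or the tool inside that initrd that loads it.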

BTW, comparing my -ml kernel to the official RHEL kernel, I can see that Red Hat's crashkernel=auto implementation adds a specific CONFIG_KEXEC_AUTO_RESERVE config option:

----
[root@moria ~]# grep -C5 CONFIG_CRASH_DUMP /boot/config-2.6.32-431.11.2.el6.x86_64
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
CONFIG_KEXEC_AUTO_RESERVE=y
CONFIG_CRASH_DUMP=y
CONFIG_KEXEC_JUMP=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HOTPLUG_CPU=y
[root@moria ~]# grep -C5 CONFIG_CRASH_DUMP /boot/config-3.14.0-1.el6.elrepo.x86_64
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_SCHED_HRTICK=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_KEXEC_JUMP=y
CONFIG_PHYSICAL_START=0x1000000
CONFIG_RELOCATABLE=y
CONFIG_PHYSICAL_ALIGN=0x1000000
CONFIG_HOTPLUG_CPU=y
----

And the documentation at /usr/share/doc/kernel-ml-doc-3.14.0/Documentation/kdump/kdump.txt does not indicate crashkernel=auto is a valid option in kernel 3.14.0 (not sure if that is still correct though). At least it never was a valid option in RHEL5.
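The same comparison could be done for the -lt kernel without reading grep context by eye. Below is a rough sketch. Both function names are made up. The option list is taken from the output above, plus CONFIG_PROC_VMCORE, which the upstream kdump documentation names for the capture kernel. Treat the list as a starting point, not as a complete set of requirements.

```shell
# Hypothetical helper: succeed when option $2 is set to y or m in the
# kernel config file $1.
config_enabled() {
    grep -q "^$2=[ym]\$" "$1"
}

# Hypothetical helper: print one line per kexec/kdump related option,
# marking it "enabled" or "missing" for the given config file.
kdump_config_report() {
    for opt in CONFIG_KEXEC CONFIG_CRASH_DUMP CONFIG_KEXEC_AUTO_RESERVE \
               CONFIG_PROC_VMCORE CONFIG_RELOCATABLE; do
        if config_enabled "$1" "$opt"; then
            echo "$opt=enabled"
        else
            echo "$opt=missing"
        fi
    done
}
```

Running kdump_config_report "/boot/config-$(uname -r)" against each installed kernel gives output that can be compared with diff. A missing CONFIG_KEXEC_AUTO_RESERVE on a non-Red Hat kernel is expected and only means crashkernel=auto is unlikely to be honoured there.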

--
-- dag wieers, d...@wieers.com, http://dag.wieers.com/
-- dagit linux solutions, cont...@dagit.net, http://dagit.net/

[Any errors in spelling, tact or fact are transmission errors]
_______________________________________________
elrepo mailing list
elrepo@lists.elrepo.org
http://lists.elrepo.org/mailman/listinfo/elrepo
