On 2013-10-25 4:02 AM, John Wright wrote:
On Sun, Oct 20, 2013 at 10:25:31PM -0400, Alex Vanderpol wrote:
Unfortunately I no longer have any of those crash dumps available to
send you. I had sent what I had gotten to the kernel maintainers
previously in an attempt to track down the cause of the crashing, which
I don't believe was ever figured out exactly but was ultimately fixed
in a later kernel version. I don't know if they would still happen to
have it on hand or not, though.

For what it's worth, there never was a vmcore file created any time
I did get a dump. Instead I always got two separate files: one which
is the main core dump and one which is supposed to be the dmesg log
dump, which unfortunately was never actually able to be dumped (the
issue I filed this bug report about). If the end result is supposed
to be one vmcore file, I suspect the inability of makedumpfile to
dump the kernel dmesg log prohibited it from combining the two files
into one file.
It's always two separate files.  They are not meant to be combined - the
dmesg dump is just intended for convenience (you can just read the file
as text instead of opening a dump with crash).

Using the 'log' command from within crash was ultimately useless as
well, as the kernel log wasn't dumped, therefore there wasn't any
log for crash to open.

This issue was with kernel 3.11-rc4-amd64 in its stock configuration.
Not a Debian package?  I'm not sure what you mean when you say stock
configuration.  Do you mean you ran 'make defconfig' to generate the
kernel .config?
What I meant was that it was the kernel image as supplied in the Debian repos, without any custom changes of any sort made. I'm sorry if I confused you; the correct terminology for some things eludes me at times.
I hope what information I am able to give you proves to be at least
somewhat useful.
I'm not really sure what you saw. :-/  I'll see if I can reproduce
anything with linux-image-3.11-1-amd64_3.11.5-1_amd64 when I have some
free time (I lost the VM I use for testing this stuff).  It's possible
there was a short-lived bug in the kernel itself, causing some corrupt
representation of its log buffer.
I am quite sorry I can't be of any real help here. If I had thought they might be necessary at all for this particular bug I would have held onto what I got from the crash dumps, but once the bug I was having with the kernel was resolved with a later kernel version, and since I'd already sent a copy of what I got to the kernel maintainers prior to the newer release, I didn't think I needed them anymore and removed them as part of my routine cleanup.

It is quite possible that, as the kernel in question was an RC build, this issue was just one more kink that was ultimately smoothed out in later builds, along with whatever was causing the kernel to crash on me whenever Folding@Home tried to resume its current work unit.

Digging back to one of my messages on the kernel bug I filed at the time, I mentioned to the kernel maintainers that:

        When I run the 'log' command within crash I get this message:
            "log: WARNING: log buf data structure(s) have changed"

and of course, the separate log dump file issue this bug is about.

The first message in the kernel bug thread from when I ran into this problem with makedumpfile is here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=719277#65 if you want to have a look. I don't know if there's anything useful there, as it wasn't long after that message that I filed this bug.
On 2013-10-20 10:29 PM, John Wright wrote:
Hi Alex,

On Fri, Aug 16, 2013 at 10:12:39PM -0400, Alex Vanderpol wrote:
Package: makedumpfile
Version: 1.5.4-1
Severity: grave
Justification: renders package unusable

Dear Maintainer,

There seems to be a serious issue with makedumpfile that causes it to fail to
dump the kernel log when collecting crash dump information. Instead, the
program continues to run indefinitely, continually appending the line
"[    0.000000] " to the file as it seems to attempt to dump the log, which, if
left alone for any considerable length of time, can rapidly result in a very
large, entirely useless dmesg dump file.
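
Editorial sketch, not part of the original report: one way to check whether a
dmesg dump file shows the symptom described above is to look for a long run of
lines that contain only a zero timestamp and no message text. The function
name, the threshold, and the exact whitespace in the pattern are assumptions
made for illustration; nothing here comes from makedumpfile itself.

```python
import re
from typing import Iterable

# Matches a line holding only a zero timestamp and no message text,
# e.g. "[    0.000000] ". The amount of padding is an assumption.
EMPTY_ZERO_TS = re.compile(r"^\[\s*0\.000000\]\s*$")


def looks_like_runaway_dump(lines: Iterable[str], threshold: int = 1000) -> bool:
    """Return True once `threshold` consecutive empty zero-timestamp lines are seen."""
    run = 0
    for line in lines:
        if EMPTY_ZERO_TS.match(line.rstrip("\n")):
            run += 1
            if run >= threshold:
                return True
        else:
            # Any line with real content resets the run.
            run = 0
    return False
```

It could be applied to a dump file by passing an open file object, for example
`looks_like_runaway_dump(open(path, errors="replace"))`, where `path` is
whatever location the dmesg dump was written to on the affected system.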

I have been trying to collect crash dump information for a crash that's
triggered whenever Folding@Home's FahCore_a4 attempts to resume an in-progress
work unit; however, every crash dump I've collected has had this problem. The
main dump file seems to be dumped without a problem (though crash identifies it
as a partial dump, possibly due to the kernel log being dumped into a separate
file).

I hope you can look into this issue and hopefully it can be sorted out soon.
Sorry for the long delay in my response.  This seems like a serious but
not actually grave issue, since the core dump does actually exist (even
though you have to interrupt the dmesg extraction).  crash identifies
the dump as a partial dump because we explicitly ignore zero pages and
userspace pages.  Within crash, you should be able to use the 'log'
command to get the most recent log messages before the crash...assuming
crash doesn't break in the same way makedumpfile does.
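
Editorial sketch, not part of the original message: the page filtering
mentioned above is controlled by makedumpfile's -d dump level, which is a
bitmask. The snippet below decodes such a value using the bit assignments
given in the makedumpfile(8) man page. The thread does not state which dump
level was actually in use, so the value 9 (zero pages plus user data) in the
usage note is only an illustration, and the function name is hypothetical.

```python
from typing import List

# Bit values for -d dump_level as listed in the makedumpfile(8) man page.
EXCLUDABLE = {
    1: "zero page",
    2: "cache page",
    4: "cache private",
    8: "user data",
    16: "free page",
}


def excluded_page_types(dump_level: int) -> List[str]:
    """List the page types that a given dump level leaves out of the dump."""
    if not 0 <= dump_level <= 31:
        raise ValueError("dump level must be between 0 and 31")
    return [name for bit, name in EXCLUDABLE.items() if dump_level & bit]
```

For example, `excluded_page_types(9)` returns the zero page and user data
entries, which corresponds to the two page types named in the message. Any
nonzero level means the resulting file is a filtered, partial dump.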

I will try to reproduce this, but I worry the problem might be somewhat
specific either to your crash or some other part of your configuration.
Would you feel comfortable making the vmcore available to me?  It would
also help to know the exact kernel version, and access to a dbg package
if it's not a stock kernel.

Sorry for the issue and thanks for the report!

