Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-10-02 Thread Alex Vanderpol
Just want to let you know this bug can be closed now. The kernel hasn't
crashed once due to Folding@Home since my last message, and I've gone
through several work units since then, all of which were resumed
several times along the way. I think it's quite safe to say that
whatever bug was causing the kernel to crash back in 3.11-rc4 was fixed
in 3.11-rc7 and remains fixed in the latest version.



--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-09-02 Thread Alex Vanderpol
I would like to report that this issue seems to have been resolved in
kernel 3.11-rc7; I am able to run Folding@Home without the kernel
crashing. I do currently appear to be working on a different type of
unit than I had been with the previous kernel, but it does
appear to be using the same core (FahCore_a4) that had been causing the
crashes.


This unit will probably take about 10 days to complete, as it's
quite a large one. Hopefully the next unit I receive is of a similar
type to the ones I had been working on previously, when the kernel was
crashing any time Folding@Home attempted to resume a work unit. I
will check back in later with a status update; hopefully this issue has
indeed been resolved and it's not just a fluke caused by a different
type of work unit.






Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-17 Thread Ben Hutchings
On Fri, 2013-08-16 at 21:54 -0400, Alex Vanderpol wrote:
> Well, I've discovered why makedumpfile continues to run even after the
> dump files show up in the folder. It's failing to properly dump the
> kernel log, and is continually appending the line [ 0.00]  to the
> dmesg file.

It sounds like the version of makedumpfile you're using doesn't
understand the structured log format introduced in Linux 3.5.  I guess
you need at least version 1.5.1-1, which has this changelog line:

  * Add --dmesg-fix from upstream 1.5.2 for kernels 3.5 and above

Ben.

-- 
Ben Hutchings
Teamwork is essential - it allows you to blame someone else.


signature.asc
Description: This is a digitally signed message part


Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-17 Thread Ben Hutchings
On Thu, 2013-08-15 at 20:01 -0400, Alex Vanderpol wrote:
> Apparently having the kernel image debug package installed is a good
> idea when trying to do anything with crash dumps... After installing the
> ~2GB (unpacked) package I was able to use the crash utility to analyze
> (to a degree) the crash dump file made by kdump-tools; however, I am
> unable to extract the kernel log from the dump.
>
> When I run the 'log' command within crash I get this message:
>   log: WARNING: log buf data structure(s) have changed
>
> I can, however, get a backtrace and the process status information from
> the dump. If you think it would be useful, I can output what I am able
> to get from crash to a file to send to you for you to look at.

Yes please.

> Also, I have a few questions:
[...]

Sorry, I don't know how to use kdump-tools myself.

Ben.

-- 
Ben Hutchings
Teamwork is essential - it allows you to blame someone else.




Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-17 Thread Alex Vanderpol
(In reply to your earlier email) The problem is that I'm already using
version 1.5.4-1 from unstable and it still has that issue, so either
something changed in a recent kernel version that broke it
again, or makedumpfile has regressed since version 1.5.1-1. Either way,
I've already filed a bug about it, so hopefully the package maintainers
will look into it.


(In reply to your later email) I've attached crash's output, including
the backtrace and process status information. If there's anything else
you need from crash (other than the unfortunately unobtainable kernel
log, which is dumped separately anyway), let me know and I'll see if
I can get it to you.


As for the questions I had asked, don't worry about them. A little later
I found some helpful information that let me address some of those
issues.

crash 7.0.1
Copyright (C) 2002-2013  Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
Copyright (C) 1999-2006  Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011  NEC Corporation
Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions.  Enter help copying to see the conditions.
This program has absolutely no warranty.  Enter help warranty for details.
 
NOTE: stdin: not a tty

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type show copying
and show warranty for details.
This GDB was configured as x86_64-unknown-linux-gnu...


please wait... (gathering kmem slab cache data)


please wait... (gathering module symbol data)
  

please wait... (gathering task table data)
   

please wait... (determining panic task)

  KERNEL: /usr/lib/debug/vmlinux
DUMPFILE: /data/crashdumps/201308162051/dump.201308162051  [PARTIAL DUMP]
CPUS: 2
DATE: Fri Aug 16 20:50:41 2013
  UPTIME: 00:01:18
LOAD AVERAGE: 1.47, 0.63, 0.23
   TASKS: 185
NODENAME: Kara01
 RELEASE: 3.11-rc4-amd64
 VERSION: #1 SMP Debian 3.11~rc4-1~exp1 (2013-08-08)
 MACHINE: x86_64  (1296 Mhz)
  MEMORY: 3.9 GB
   PANIC: 
WARNING: log buf data structure(s) have changed

 PID: 1765
 COMMAND: FahCore_a4
TASK: 880137280800  [THREAD_INFO: 88013a41]
 CPU: 0
   STATE: TASK_RUNNING (PANIC)

PID: 1765   TASK: 880137280800  CPU: 0   COMMAND: FahCore_a4
 #0 [88013a4119f0] machine_kexec at 8103366c
 #1 [88013a411a50] crash_kexec at 8108f00e
 #2 [88013a411b08] oops_end at 8138f793
 #3 [88013a411b28] no_context at 81387f81
 #4 [88013a411b68] __do_page_fault at 81391a58
 #5 [88013a411c60] page_fault at 8138ee18
[exception RIP: jbd2_journal_file_inode+53]
RIP: a02af28c  RSP: 88013a411d10  RFLAGS: 00010246
RAX:   RBX: 880138fe0ec0  RCX: 0019
RDX: 880138fe0ec0  RSI:   RDI: 8801362553d8
RBP:    R8: 880136541218   R9: b923
R10: 0020  R11:   R12: 8801362553d8
R13: 8801362553d8  R14: 08cc  R15: 1000
ORIG_RAX:   CS: 0010  SS: 0018
 #6 [88013a411d30] ext4_block_zero_page_range at a02d9112 [ext4]
 #7 [88013a411d88] ext4_truncate at a02d9b5f [ext4]
 #8 [88013a411de8] ext4_setattr at a02da5b2 [ext4]
 #9 [88013a411e48] notify_change at 81128a87
#10 [88013a411eb8] do_truncate at 811134fa
#11 [88013a411f20] vfs_truncate at 81113656
#12 [88013a411f48] do_sys_truncate at 811137ce
#13 [88013a411f80] system_call_fastpath at 81393d29
RIP: 008c3797  RSP: 7fd67a8fda68  RFLAGS: 0246
RAX: 004c  RBX: 81393d29  RCX: 
RDX: 015cb3c0  RSI: 0007f8cc  RDI: 0170f590
RBP: 1020   R8: 00c00140   R9: 06e5
R10:   R11: 0206  R12: 0001
R13: 015c9900  R14: 015caa50  R15: 8801365fde40
ORIG_RAX: 004c  CS: 0033  SS: 002b
   PID    PPID  CPU     TASK      ST  %MEM   VSZ   RSS  COMM
     0       0    0   81613400    RU   0.0     0     0  [swapper/0]
 0  
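
A note on reading the backtrace above: the final register frame is the user-mode state at syscall entry, so on x86_64 ORIG_RAX should hold the syscall number and RSI the second argument. The sketch below decodes the values shown. It assumes the x86_64 syscall numbering (76 is truncate, 77 is ftruncate) and a 4 KiB block size (suggested by R15 in the exception frame, but not confirmed); the table and the decode helper are illustrative names only. The archived register values appear to have lost leading digits, so only the low bits are used.

```python
# Values copied from the crash backtrace above.
ORIG_RAX = 0x4C    # syscall number at entry (user-mode frame)
RSI = 0x7F8CC      # second syscall argument (user-mode frame)
R14_FAULT = 0x8CC  # R14 in the exception frame

# Partial x86_64 syscall table, just enough for this trace.
X86_64_SYSCALLS = {76: "truncate", 77: "ftruncate"}

# Assumption: 4 KiB filesystem block size.
BLOCK_SIZE = 0x1000


def decode(orig_rax, rsi, block_size=BLOCK_SIZE):
    """Return (syscall name, requested length, offset into the last block)."""
    name = X86_64_SYSCALLS.get(orig_rax, "unknown")
    return name, rsi, rsi % block_size


if __name__ == "__main__":
    name, length, tail = decode(ORIG_RAX, RSI)
    print(name, length, hex(tail), tail == R14_FAULT)
```

If those assumptions hold, the fault occurred during a truncate() to 0x7f8cc bytes, a length that is not block-aligned. The remainder 0x8cc equals R14 in the exception frame, which would be consistent with ext4_block_zero_page_range zeroing the tail of the last block when it called into jbd2.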

Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-16 Thread Alex Vanderpol
I've discovered that the kernel only crashes when the Folding@Home core
attempts to resume a work unit already in progress. After configuring
kdump-tools to not collect unused memory pages (something apparently
recommended if your system has a larger amount of memory), I attempted
to trigger a crash by starting the Folding@Home client service. However,
after starting the service and waiting about 15 seconds (about how
long it normally takes from starting the service to the kernel crashing),
there was no crash. I waited a while longer, then checked on the work unit
progress with FAHControl and noticed that Folding@Home had just
downloaded and started a new work unit (which, thankfully, uses the same
core as the previous one), as the previous unit had already been
finished. I let it run all night without any issues. However, when I
powered off my laptop, booted it up again later and attempted to run
Folding@Home again, the kernel crashed as soon as Folding@Home tried to
resume the work unit.


I do seem to have a problem getting crash dump collection to work,
though. The dmesg file collected with this latest crash dump (which is
apparently where the kernel log gets dumped, separate from the dump
file) contains only the line [0.00], repeated precisely 15447298 times
(according to nano's line count upon opening the file). Clearly
something is broken, but I do not know what exactly. (Previous
attempts at crash dump collection did not even give me a plain text
dmesg file; for some reason they were being saved as binary files, and I
was unable to read them.)
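
A line count like the one above can also be taken without opening the file in an editor, by streaming it one line at a time. The following is a minimal Python sketch; summarize_lines is a hypothetical helper name, and it keeps one counter per distinct line, so it only suits files dominated by a few repeated lines, as this one apparently is.

```python
from collections import Counter


def summarize_lines(lines, top=3):
    """Return (total line count, most common lines) for an iterable of lines.

    The input is consumed lazily, so a file object can be passed directly
    without reading the whole file into memory.
    """
    counts = Counter()
    total = 0
    for line in lines:
        counts[line.rstrip("\n")] += 1
        total += 1
    return total, counts.most_common(top)
```

It could be used along the lines of `with open("dmesg.201308111839", errors="replace") as fh: print(summarize_lines(fh))`, run as root since the dump files are not readable by ordinary users.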


I may continue trying to get a proper crash dump, with a kernel
log that has actual information in it, so you have something to look at,
though at this rate I'm about ready to give up. If you'd like what I
have managed to get so far, useless dmesg file and all, I've packaged it
up as a 277.4 MB .tar.gz that I could send or upload to a file hosting
site for you to download.


If I do actually manage to get a kernel log with something useful in it 
(or, at least, something other than the same line repeated millions of 
times over) I will send that to you as soon as I can.






Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-16 Thread Alex Vanderpol
Well, I've discovered why makedumpfile continues to run even after the
dump files show up in the folder. It's failing to properly dump the
kernel log, and is continually appending the line [ 0.00]  to the
dmesg file. I just ended up with a 3 GB file; nano gave up trying
to open it, so I have no idea how many times that line was printed
into the file. I am going to file a bug on makedumpfile about this
and hopefully it can be resolved. Until then I am ceasing my dump
collection attempts.


(That said, the main dump file seems to be in alright shape, so some
information may be able to be gleaned from that. I've archived my
latest crash dump without the unnecessarily large, useless dmesg file
and the total size comes to 107.4 MB. If your mail server can handle a
file this size and you feel there may be something useful in the dump, I
can send the file with my next message.)


(Also, I was wrong about the previous dmesg dump files being
saved as binary files. That was a permissions-related issue,
as they can only be viewed as root. Attempting to view them as a
non-root user seems to misidentify them as binary files rather
than plain text.)






Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-15 Thread Alex Vanderpol
Apparently having the kernel image debug package installed is a good
idea when trying to do anything with crash dumps... After installing the
~2GB (unpacked) package I was able to use the crash utility to analyze
(to a degree) the crash dump file made by kdump-tools; however, I am
unable to extract the kernel log from the dump.


When I run the 'log' command within crash I get this message:
log: WARNING: log buf data structure(s) have changed

I can, however, get a backtrace and the process status information from 
the dump. If you think it would be useful, I can output what I am able 
to get from crash to a file to send to you for you to look at.


Also, I have a few questions:

1) Is there any way at all to give the crash kernel more memory to work 
with? kdump-tools does not work if I specify an amount greater than 128M 
in the bootloader config file (the system does not reboot into the crash 
kernel), which seems unnecessarily small for a system with 4GB of memory 
available, and it seems like the small amount of memory available slows 
things down considerably.


2) How long should the crash dump collection process normally take? I've
noticed that it usually takes about 4 or 5 minutes after the system
finishes rebooting for the dump and dmesg files to show up in the crash
dump folder specified (prior to which there's only one file,
dump_incomplete); however, the makedumpfile process seems to continue
running even after 5 hours (watching it with top).


3) Is the system supposed to boot as normal when booting into the crash 
kernel to collect the crash dump? I ask, because mine does, and the 
severely limited amount of memory available doesn't seem to allow for a 
full boot. (I suspect I may need to specifically tell kdump-tools to 
boot into a more suitable runlevel, as it doesn't appear to do so on its 
own.)
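
On question 1, a first step is confirming what reservation the running kernel was actually booted with. The sketch below pulls the crashkernel= value out of a kernel command line string; crashkernel_setting is a hypothetical helper name, and feeding it the contents of /proc/cmdline is assumed to be how it would be used on a live system. It does not explain why values above 128M fail here.

```python
def crashkernel_setting(cmdline):
    """Return the value of the crashkernel= boot parameter, or None if absent."""
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            # Split only on the first "=" so the value is returned intact.
            return token.split("=", 1)[1]
    return None
```

For example, `crashkernel_setting(open("/proc/cmdline").read())` would report the value passed by the bootloader for the current boot.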






Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-14 Thread Ben Hutchings
On Tue, 2013-08-13 at 19:55 -0400, Alex Vanderpol wrote:
> I'll send the kernel log, though there's no record of the crash anywhere
> in the log and I can't really see anything in the log that would be
> useful...
[...]

You've sent /var/log/kern.log which I didn't expect would include any
useful information.

I meant that you should extract the kernel log from the crash dump.
Unless you already tried it and this is what you meant when you said
'the dmesg dump is 2.9 GB' (it shouldn't be nearly that large...)

Ben.

-- 
Ben Hutchings
Man invented language to satisfy his deep need to complain. - Lily Tomlin





Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-14 Thread Alex Vanderpol
Ah, I didn't know exactly what you meant. Unfortunately I don't know how 
to extract anything from the dump files I got with kdump-tools. There 
are two files in the crash dump directory I made and pointed kdump-tools 
to, dmesg.201308111839 (which is the 2.9 GB file) and dump.201308111839 
(the 1.5 GB file). I cannot seem to find anything useful with Google 
about what to do with these files.
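
The numeric suffix on those file names looks like a YYYYMMDDHHMM timestamp of the crash; that reading is an inference from the names, not something taken from the kdump-tools documentation. Under that assumption, a small Python sketch (dump_timestamp is a hypothetical helper) can pair up the dmesg and dump files from one crash:

```python
from datetime import datetime


def dump_timestamp(filename):
    """Split 'dump.201308111839' into ('dump', datetime(2013, 8, 11, 18, 39)).

    Assumes the suffix after the last dot is YYYYMMDDHHMM.
    """
    prefix, _, stamp = filename.rpartition(".")
    return prefix, datetime.strptime(stamp, "%Y%m%d%H%M")
```

Files whose timestamps compare equal would belong to the same crash.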


I'm pretty sure the Debian-supplied kernel is configured to work with 
kdump-tools (at least, the default configuration state in the sources 
was configured correctly for such, and I did get a kernel dump), but I 
do not have a debug kernel image available, which I'm assuming would 
probably make this easier.


I'm going to look into setting things up better so I can hopefully
get a crash dump I can actually do something with. I found a site that
has some useful information, so maybe I can get something that's
actually useful, though if you know how to work with the files I
mentioned above, I'll gladly take that information as well.






Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-13 Thread Ben Hutchings
On Sun, 2013-08-11 at 19:48 -0400, Alex Vanderpol wrote:
> I have to ask: Is it normal for a crash dump (and, apparently, a dmesg
> dump as well) to be several GB in size? I ask, because my dump file from
> the crash is 1.5 GB and the dmesg dump is 2.9 GB.

Yes, I'm afraid so.

> I would like to submit these somehow but I don't think via email would
> be the best way to do so, and I can't find any good, free file hosting
> sites that will accept files this large. Would anyone have any
> suggestions as to what to do with them?

I don't think Debian has any regular arrangement for this at the moment.
And anyway, this will need to be forwarded upstream once we have a rough
idea of where the bug lies.

You could start by sending just the kernel log; that might be enough
information to make some progress.

Ben.

-- 
Ben Hutchings
Experience is what causes a person to make new mistakes instead of old ones.




Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-13 Thread Alex Vanderpol
I was unable to get a photo of the screen output; apparently neither of
the cameras I have available can take shots at a high enough resolution
to make the output even slightly readable. So I carefully wrote down
(nearly) everything that was displayed and typed it out into a
text file (formatting may not be *exactly* as it was on screen, but should
be close enough) to send to you. The only thing not written down
was the long list of linked-in kernel modules (as I didn't think it
was necessary), though if needed I can easily enough crash the kernel
again to get that list for you.


During my writing out of the terminal output I came to understand, due 
to its specifically being referenced in the output, that it's not the 
Folding@Home client service itself that's crashing the kernel, but the 
Folding@Home core (specifically, FahCore_a4, as you'll see in the 
terminal output) that's causing the kernel crash.


Anyway, I hope this might help shed some light on the problem.
BUG: Unable to handle kernel NULL pointer dereference at (null)
IP: [a029728c] jbd2_journal_file_inode+0x35/0xdd [jbd2]
PGD 1399c1067 PUD 1399b6067 PMD 0
Oops:  [#1] SMP
Modules linked in: [long list of modules]
CPU: 0 PID: 1963 Comm: FahCore_a4 Tainted: G   I  3.11-rc4-amd64 #1 Debian 3.11~rc4-1~exp1
Hardware name: Acer Aspire 1810TZ/JM11-MS, BIOS v1.3314 08/31/2010
task: 880139a40801 ti: 880139a4 task.ti: 880139a4
RIP: 0010:[a029728c]  [a029728c] jbd2_journal_file_inode+0x35/0xdd [jbd2]
RSP: 0018:880139a41d10 EFLAGS:00010246
RAX:  RBX: 880138f7e1c0 RCX: 0019
RDX: 880138f7e1c0 RSI:  RDI: 8801382c4408
RBP:  R08: 8801382cf218 R09: 0020
R10:  R11:  R12: 8801382c4408
R13: 8801328c4408 R14: 08cc R15: 1000
FS: 7f595ead8700() GS: 88013fc0() knlGS: 
CS: 0010 DS:  ES:   CR0: 80050033
CR2:  CR3: 000139993000 CR4: 000407f0
Stack:
  ea000404b728 8801382cf0b0 0734 8801382c4408
  a02b1112 1000 007f 08cc
  8801382cb540 8801382cf0b0 880139a41de0 8801382c4408
Call Trace:
  [a02b1112] ? ext4_block_zero_page_range+0x28b/0x29c [ext4]
  [a02b1bff] ? ext4_truncate+0x152/0x27f [ext4]
  [8111d48e] ? walk_component+0x163/0x1a2
  [8112a22c] ? mntget+0x17/0x1c
  [811287b9] ? inode_change+0x2c/0x11a
  [a02b25b2] ? ext4_setattr+0x412/0x4b2 [ext4]
  [81046971] ? current_fs_time+0x2f/0x35
  [81128a87] ? notify_change+0x1e0/0x2cc
  [81103b75] ? kmem_cache_free+0x3f/0x7c
  [811134fa] ? do_truncate+0x63/0x87
  [81113656] ? vfs_truncate+0xe6/0x10d
  [811137ce] ? do_sys_truncate+0x3d/0x77
  [81393d29] ? system_call_fastpath+0x16/0x1b
Code: f5 53 48 8b 1f 48 85 db 75 11 be 49 09 00 00 48 c7 c7 14 03 2a a0 e8 70 
c7 da e0 4c 89 e7 e8 3f df ff ff 85 c0 0f 85 98 00 00 48 39 5d 00 4c 8b 2b 0f 
84 92 00 00 00 48 39 5d 08 0f 84 88 00
RIP [a029728c] jbd2_journal_file_inode+0x35/0xdd [jbd2]
 RSP 880139a41d10
CR2: 
---[ end trace 0b57ed6584cd4409 ]---
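
For anyone reading the trace above: each entry has the form symbol+offset/size [module], with the offset into the function and the function size both in hex. A minimal Python sketch for pulling those fields apart follows; the regular expression and the parse_frame name are illustrative and not part of any kernel tool.

```python
import re

# Matches entries such as "jbd2_journal_file_inode+0x35/0xdd [jbd2]".
# The trailing "[module]" is optional (built-in symbols have none).
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[\w-]+)\])?"
)


def parse_frame(line):
    """Return symbol, offset, size and module for one oops line, or None."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),
    }
```

For the faulting instruction, 0x35 is 53 in decimal, which is how the crash utility reports the same location (jbd2_journal_file_inode+53) in the backtrace from the 2013-08-16 dump.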


Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-11 Thread Alex Vanderpol
So I figured out what's crashing the kernel: apparently kernel 3.11-rc4
and Folding@Home (when run as a system service) don't get along. I
suspect this may be an issue with Folding@Home rather than the kernel, so
I may need to get in touch with them and inform them of this issue so it
can be resolved.






Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-11 Thread Ben Hutchings
On Sun, 2013-08-11 at 07:40 -0400, Alex Vanderpol wrote:
> So I figured out what's crashing the kernel: apparently kernel 3.11-rc4
> and Folding@Home (when run as a system service) don't get along. I
> suspect this may be an issue with Folding@Home rather than the kernel, so
> I may need to get in touch with them and inform them of this issue so it
> can be resolved.

It is a kernel bug; no application should be able to crash the kernel
(unless it's run with special privileges).

Ben.

-- 
Ben Hutchings
For every complex problem
there is a solution that is simple, neat, and wrong.




Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-11 Thread Alex Vanderpol
Oh, well, in that case, I guess reporting it was a good idea. I can
probably capture a crash dump some time later, if you need it, but right
now I need to get some rest.









Bug#719277: linux-image-3.11-rc4-amd64: Kernel crashes when running Folding@Home as a system service

2013-08-11 Thread Alex Vanderpol
I have to ask: Is it normal for a crash dump (and, apparently, a dmesg 
dump as well) to be several GB in size? I ask, because my dump file from 
the crash is 1.5 GB and the dmesg dump is 2.9 GB.


I would like to submit these somehow but I don't think via email would
be the best way to do so, and I can't find any good, free file hosting
sites that will accept files this large. Would anyone have any
suggestions as to what to do with them?


