Bug in 4.5.0: hard lockups on docker operations

2016-03-14 Thread Torsten Luettgert
Hello kernel hackers, I'm still getting hard lockups on my Docker machine with overlayfs and 3ware RAID with Linux 4.5.0. They look a bit different though. It would be great if someone could look into it. Kernel messages follow: NMI watchdog: Watchdog detected hard LOCKUP on cpu 6 Kernel panic -

BUG: hard lockups on docker operations

2016-03-11 Thread Torsten Luettgert
Hello kernel hackers, I'm getting hard lockups with kernel 4.4.5 (and with at least 4.4.3 and 4.4.4 as well). This is a Docker hypervisor machine with overlayfs and 3ware RAID. They usually happen when I do something with Docker, here when I ran poweroff in a container (those are all VM-like containers which

Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160

2015-04-27 Thread Torsten Luettgert
On Wed, 22 Apr 2015 10:45:19 +0200 Torsten Luettgert wrote: > I'll keep this running, and if - when - it has been going for a week > (3 times the max uptime without the patch) - I'll give a final notice. So here it is, the final notice: 17:16:36 up 7 days, 4:07, 2 users, lo

Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160

2015-04-22 Thread Torsten Luettgert
On Mon, 20 Apr 2015 13:24:24 +0200 Torsten Luettgert wrote: > > Can you test the patch below? > > I'm running it right now and keeping my fingers crossed. Just under two days uptime now, and no crashes. I'm pretty sure you nailed it. I'll keep this running, and if

Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160

2015-04-20 Thread Torsten Luettgert
On Sun, 19 Apr 2015 18:58:41 +0200 Christoph Hellwig wrote: > This looks like a long-standing bug in all three 3ware drivers to > me, that taking the host lock around the host_busy manipulation > was hiding. > > Can you test the patch below? I'm running it right now and keeping my fingers c

Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160

2015-04-17 Thread Torsten Luettgert
On Fri, 17 Apr 2015 15:31:16 +0200 Torsten Luettgert wrote: > On Mon, 13 Apr 2015 20:28:29 +0200 > Torsten Luettgert wrote: > > Right now, I'm trying the problematic release, compiled with a newer > gcc (4.9.2-6 from Fedora, while using 4.4.7-11 from rhel6 > before). It

Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160

2015-04-17 Thread Torsten Luettgert
On Mon, 13 Apr 2015 20:28:29 +0200 Torsten Luettgert wrote: > On Mon, 13 Apr 2015 19:41:05 +0200 > Christoph Hellwig wrote: > > > Can you run gdb on your vmlinux file and send the output of the > > following command in gdb > > > > l *(scsi_dma_unmap+0x54) >

Re: BUG: unable to handle kernel NULL pointer deref, bisected to 746650160

2015-04-13 Thread Torsten Luettgert
On Mon, 13 Apr 2015 19:41:05 +0200 Christoph Hellwig wrote: > Can you run gdb on your vmlinux file and send the output of the > following command in gdb > > l *(scsi_dma_unmap+0x54) Thanks for looking into it! Here is what gdb says: Reading symbols from /opt/kvm/bisect/vmlinux-3.16.0-746650160

BUG: unable to handle kernel NULL pointer deref, bisected to 746650160

2015-04-08 Thread Torsten Luettgert
Hello, I've been getting NULL pointer deref BUGs on a Supermicro machine of mine since 3.17. They occur at random uptimes, often a few hours after booting (max uptime so far was 2 days). I bisected the problem (took a while); the problematic commit seems to be 746650160866 (scsi: convert host_busy to atomi

BUG: incorrect MD5 hash value on x86_64 with TCP-MD5

2007-01-20 Thread Torsten Luettgert
Hello all, There's still a bug in the new TCP-MD5 feature. On x86_64, the hash function is fed wrong TCP payload content. The same kernel, same conf (except arch -> x86) on an Athlon doesn't have the problem. Kernel is a vanilla 2.6.20-rc5. I put debugging printk()s into tcp_v4_do_calc_md5_hash(

incorrect TCP checksum on sent TCP-MD5 packets (2.6.20-rc5)

2007-01-14 Thread Torsten Luettgert
Hi, I'm using the new TCP-MD5 option in 2.6.20-rc4 and rc5 to talk BGP to Cisco routers. My box connects to the Cisco, and the handshake looks fine: SYN, SYN/ACK, ACK all have the MD5 option and correct TCP checksums. All packets after that, i.e. the ones with payload data, have wrong TCP checksums,