Re: [DRBD-user] Page allocation errors and kernel panics with drbd 8.3.3rc1 and infiniband

2009-10-05 Thread Lars Ellenberg
On Mon, Oct 05, 2009 at 11:49:11AM +0200, Lars Ellenberg wrote:
> It has different context, but that thread may give you an idea on
> how to track it down further: turn on slab debug,
> then sample /proc/slabinfo, /proc/slab_allocators, /proc/net/sockstat,
> and maybe similar statistics in the infiniband area.

And if it turns out to be too hard to track down,
and it is actually "only" failing allocations from interrupt context
(or other "tight" code paths), try reserving some megabytes for those
code paths.

You seem to have plenty of RAM around (24 GiB, IIRC),
so why not simply reserve 128 MiB?
echo $[128 << 10] > /proc/sys/vm/min_free_kbytes

 ;)
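
A minimal sketch of the same idea (the 128 MiB figure is just the suggestion
above and should be tuned to the workload; note that raising min_free_kbytes
trades usable RAM for allocation headroom):

    # set the reserve now (equivalent to the echo above; 128 * 1024 KiB = 128 MiB)
    sysctl -w vm.min_free_kbytes=$((128 * 1024))
    # make it persistent across reboots
    echo 'vm.min_free_kbytes = 131072' >> /etc/sysctl.conf
    # verify
    cat /proc/sys/vm/min_free_kbytes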

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Page allocation errors and kernel panics with drbd 8.3.3rc1 and infiniband

2009-10-05 Thread Lars Ellenberg
On Mon, Oct 05, 2009 at 10:41:23AM +0200, Lars Ellenberg wrote:
> On Sun, Oct 04, 2009 at 10:14:22PM +0200, Lars Ellenberg wrote:
> > On Sun, Oct 04, 2009 at 03:55:44AM -0400, Gennadiy Nerubayev wrote:
> > > On Tue, Sep 22, 2009 at 5:01 PM, Jason McKay  wrote:
> > > 
> > > > On Sep 22, 2009, at 4:34 PM, Lars Ellenberg wrote:
> > > >
> > > > > But correcting the tcp_mem setting above
> > > > > is more likely to fix your symptoms.
> > > >
> > > > I suspect it will.  We'll test and follow up.
> > > >
> > > 
> > > Hi guys,
> > > 
> > > Unfortunately these are still occurring, even after we've updated to rc3
> > > and used the tuning settings from the rc3 notes (prior to this, a
> > > percentage of memory in pages was attempted, with the same results). They
> > > are a lot less frequent (intervals measured in hours) and have not yet
> > > caused a panic, but of course the worry is that it may happen regardless.
> > > Anything else that we could try here to eliminate it completely? Is there
> > > any chance that the IPoIB stack is at fault?
> > 
> > Possibly.
> > Maybe Vlad knows more?
> > From http://www.openfabrics.org/txt/documentation/linux/EWG_meeting_minutes/12_01_08.txt:
> > 1419  maj  v...@mellanox  Iperf-2.0.4 fails: page allocation failure. order:5
> > I guess that means https://bugs.openfabrics.org/show_bug.cgi?id=1419
> > Not much progress on that bug, though.
> 
> This appears related, as well:
> http://bugzilla.kernel.org/show_bug.cgi?id=10890
> 
> Though there it was claimed that leaving network sysctls at the defaults
> "solved" the issue.

And yet one more, where sysctls helped:
http://thread.gmane.org/gmane.linux.nfs/20761/focus=695707

It has different context, but that thread may give you an idea on
how to track it down further: turn on slab debug,
then sample /proc/slabinfo, /proc/slab_allocators, /proc/net/sockstat,
and maybe similar statistics in the infiniband area.
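
A rough sampling sketch for that (it assumes the kernel actually exposes
/proc/slab_allocators, which needs the slab debugging config options; the
interval and log path here are arbitrary):

    # periodically snapshot slab and socket statistics while reproducing the load
    while true; do
        date
        cat /proc/slabinfo
        cat /proc/slab_allocators 2>/dev/null   # only present with slab debug enabled
        cat /proc/net/sockstat
        echo '----'
        sleep 30
    done > /var/tmp/slab-samples.log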

BTW, maybe your netdev_max_backlog is a bit excessive?
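
A quick way to check, and to fall back to the stock value for a test
(the default is typically 1000 on kernels of this vintage):

    # show the current value
    sysctl net.core.netdev_max_backlog
    # temporarily drop it back to the usual default
    sysctl -w net.core.netdev_max_backlog=1000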


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Page allocation errors and kernel panics with drbd 8.3.3rc1 and infiniband

2009-10-05 Thread Lars Ellenberg
On Sun, Oct 04, 2009 at 10:14:22PM +0200, Lars Ellenberg wrote:
> On Sun, Oct 04, 2009 at 03:55:44AM -0400, Gennadiy Nerubayev wrote:
> > On Tue, Sep 22, 2009 at 5:01 PM, Jason McKay  wrote:
> > 
> > > On Sep 22, 2009, at 4:34 PM, Lars Ellenberg wrote:
> > >
> > > > But correcting the tcp_mem setting above
> > > > is more likely to fix your symptoms.
> > >
> > > I suspect it will.  We'll test and follow up.
> > >
> > 
> > Hi guys,
> > 
> > Unfortunately these are still occurring, even after we've updated to rc3
> > and used the tuning settings from the rc3 notes (prior to this, a percentage
> > of memory in pages was attempted, with the same results). They are a lot
> > less frequent (intervals measured in hours) and have not yet caused a panic,
> > but of course the worry is that it may happen regardless. Anything else that
> > we could try here to eliminate it completely? Is there any chance that the
> > IPoIB stack is at fault?
> 
> Possibly.
> Maybe Vlad knows more?
> From http://www.openfabrics.org/txt/documentation/linux/EWG_meeting_minutes/12_01_08.txt:
> 1419  maj  v...@mellanox  Iperf-2.0.4 fails: page allocation failure. order:5
> I guess that means https://bugs.openfabrics.org/show_bug.cgi?id=1419
> Not much progress on that bug, though.

This appears related, as well:
http://bugzilla.kernel.org/show_bug.cgi?id=10890

Though there it was claimed that leaving network sysctls at the defaults
"solved" the issue.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Page allocation errors and kernel panics with drbd 8.3.3rc1 and infiniband

2009-10-04 Thread Lars Ellenberg
On Sun, Oct 04, 2009 at 03:55:44AM -0400, Gennadiy Nerubayev wrote:
> On Tue, Sep 22, 2009 at 5:01 PM, Jason McKay  wrote:
> 
> > On Sep 22, 2009, at 4:34 PM, Lars Ellenberg wrote:
> >
> > > But correcting the tcp_mem setting above
> > > is more likely to fix your symptoms.
> >
> > I suspect it will.  We'll test and follow up.
> >
> 
> Hi guys,
> 
> Unfortunately these are still occurring, even after we've updated to rc3
> and used the tuning settings from the rc3 notes (prior to this, a percentage
> of memory in pages was attempted, with the same results). They are a lot less
> frequent (intervals measured in hours) and have not yet caused a panic, but
> of course the worry is that it may happen regardless. Anything else that we
> could try here to eliminate it completely? Is there any chance that the IPoIB
> stack is at fault?

Possibly.
Maybe Vlad knows more?
From http://www.openfabrics.org/txt/documentation/linux/EWG_meeting_minutes/12_01_08.txt:
1419  maj  v...@mellanox  Iperf-2.0.4 fails: page allocation failure. order:5
I guess that means https://bugs.openfabrics.org/show_bug.cgi?id=1419
Not much progress on that bug, though.

-- 
: Lars Ellenberg
: LINBIT HA-Solutions GmbH
: DRBD®/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Page allocation errors and kernel panics with drbd 8.3.3rc1 and infiniband

2009-10-04 Thread Gennadiy Nerubayev
On Tue, Sep 22, 2009 at 5:01 PM, Jason McKay  wrote:

> On Sep 22, 2009, at 4:34 PM, Lars Ellenberg wrote:
>
> > But correcting the tcp_mem setting above
> > is more likely to fix your symptoms.
>
> I suspect it will.  We'll test and follow up.
>

Hi guys,

Unfortunately these are still occurring, even after we've updated to rc3
and used the tuning settings from the rc3 notes (prior to this, a percentage
of memory in pages was attempted, with the same results). They are a lot less
frequent (intervals measured in hours) and have not yet caused a panic, but of
course the worry is that it may happen regardless. Anything else that we could
try here to eliminate it completely? Is there any chance that the IPoIB stack
is at fault?
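
For reference, the TCP memory settings in question can be dumped like this
(a generic check only; the specific values recommended in the rc3 notes are
not repeated here):

    sysctl net.ipv4.tcp_mem net.ipv4.tcp_rmem net.ipv4.tcp_wmem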

Thanks!

-Gennadiy
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Page allocation errors and kernel panics with drbd 8.3.3rc1 and infiniband

2009-09-23 Thread Lars Ellenberg
On Tue, Sep 22, 2009 at 05:01:00PM -0400, Jason McKay wrote:
> > But correcting the tcp_mem setting above
> > is more likely to fix your symptoms.
> 
> I suspect it will.  We'll test and follow up.
> 
> Much thanks for the quick reply.

We try to please.
Much more so for partners or feature-sponsors such as you ;)


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list   --   I'm subscribed
___
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user


Re: [DRBD-user] Page allocation errors and kernel panics with drbd 8.3.3rc1 and infiniband

2009-09-22 Thread Jason McKay

On Sep 22, 2009, at 4:34 PM, Lars Ellenberg wrote:

> On Tue, Sep 22, 2009 at 01:31:26PM -0400, Jason McKay wrote:
>> Hello all,
>>
>> We're experiencing page allocation errors when writing to a connected
>> drbd device (connected via infiniband using IPoIB) that resulted in a
>> kernel panic last night.
>>
>> When rsyncing data to the primary node, we get a slew of these
>> errors in /var/log/messages:
>>
>> Sep 21 22:11:21 client-nfs5 kernel: drbd0_worker: page allocation
>> failure. order:5, mode:0x10
>> Sep 21 22:11:21 client-nfs5 kernel: Pid: 6114, comm: drbd0_worker
>> Tainted: P  2.6.27.25 #2
>> Sep 21 22:11:21 client-nfs5 kernel:
>> Sep 21 22:11:21 client-nfs5 kernel: Call Trace:
>> Sep 21 22:11:21 client-nfs5 kernel:  []
>> __alloc_pages_internal+0x3a4/0x3c0
>> Sep 21 22:11:21 client-nfs5 kernel:  []
>> kmem_getpages+0x6b/0x12b
>> Sep 21 22:11:21 client-nfs5 kernel:  []
>> fallback_alloc+0x11d/0x1b1
>> Sep 21 22:11:21 client-nfs5 kernel:  []
>> kmem_cache_alloc_node+0xa3/0xcf
>> Sep 21 22:11:21 client-nfs5 kernel:  []
>> __alloc_skb+0x64/0x12e
>> Sep 21 22:11:21 client-nfs5 kernel:  []
>> sk_stream_alloc_skb+0x2f/0xd5
>> Sep 21 22:11:21 client-nfs5 kernel:  []
>> tcp_sendmsg+0x180/0x9d1
>> Sep 21 22:11:21 client-nfs5 kernel:  []
>> sock_sendmsg+0xe2/0xff
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> autoremove_wake_function+0x0/0x2e
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> autoremove_wake_function+0x0/0x2e
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> zone_statistics+0x3a/0x5d
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> kernel_sendmsg+0x2c/0x3e
>> Sep 21 22:11:22 client-nfs5 kernel:  [] drbd_send
>> +0xb2/0x194 [drbd]
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> _drbd_send_cmd+0x9c/0x116 [drbd]
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> send_bitmap_rle_or_plain+0xd7/0x13a [drbd]
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> _drbd_send_bitmap+0x18d/0x1ae [drbd]
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> drbd_send_bitmap+0x39/0x4c [drbd]
>
> I certainly would not expect to _cause_
> memory pressure from this call path.
> But someone is causing it, and we are affected.
>
>
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> w_bitmap_io+0x45/0x95 [drbd]
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> drbd_worker+0x230/0x3eb [drbd]
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> drbd_thread_setup+0x124/0x1ba [drbd]
>> Sep 21 22:11:22 client-nfs5 kernel:  [] child_rip
>> +0xa/0x11
>> Sep 21 22:11:22 client-nfs5 kernel:  []
>> drbd_thread_setup+0x0/0x1ba [drbd]
>> Sep 21 22:11:22 client-nfs5 kernel:  [] child_rip
>> +0x0/0x11
>
> Why is the TCP stack trying to allocate order:5 pages?
>
> That's 32 contiguous pages, 128 KiB of contiguous memory. Apparently your
> memory is so fragmented that this is no longer available.
> But shouldn't the TCP stack be OK with order:0 or order:1 pages?
> Why does it have to be order:5?
>
>> These were occurring for hours until a kernel panic:
>>
>> [Mon Sep 21 22:11:33 2009]INFO: task kjournald:7025 blocked for
>> more than 120 seconds.
>> [Mon Sep 21 22:11:33 2009]"echo 0 > /proc/sys/kernel/
>> hung_task_timeout_secs" disables this message.
>> [Mon Sep 21 22:11:33 2009] 88062e801d30 0046
>> 88062e801cf8 a056b821
>> [Mon Sep 21 22:11:33 2009] 880637d617c0 88063dd05120
>> 88063e668750 88063dd05478
>> [Mon Sep 21 22:11:33 2009] 00090002 00011a942a2d
>>  
>> [Mon Sep 21 22:11:33 2009]Call Trace:
>> [Mon Sep 21 22:11:33 2009] [] drbd_unplug_fn
>> +0x14a/0x1aa [drbd]
>> [Mon Sep 21 22:11:33 2009] [] sync_buffer+0x0/0x3f
>> [Mon Sep 21 22:11:33 2009] [] io_schedule+0x5d/0x9f
>> [Mon Sep 21 22:11:33 2009] [] sync_buffer+0x3b/0x3f
>> [Mon Sep 21 22:11:33 2009] [] __wait_on_bit
>> +0x40/0x6f
>> [Mon Sep 21 22:11:33 2009] [] sync_buffer+0x0/0x3f
>> [Mon Sep 21 22:11:34 2009] []
>> out_of_line_wait_on_bit+0x6c/0x78
>> [Mon Sep 21 22:11:34 2009] [] wake_bit_function
>> +0x0/0x23
>> [Mon Sep 21 22:11:34 2009] []
>> journal_commit_transaction+0x7cd/0xc4d [jbd]
>> [Mon Sep 21 22:11:34 2009] [] lock_timer_base
>> +0x26/0x4b
>> [Mon Sep 21 22:11:34 2009] [] kjournald
>> +0xc1/0x1fb [jbd]
>> [Mon Sep 21 22:11:34 2009] []
>> autoremove_wake_function+0x0/0x2e
>> [Mon Sep 21 22:11:34 2009] [] kjournald+0x0/0x1fb
>> [jbd]
>> [Mon Sep 21 22:11:34 2009] [] kthread+0x47/0x73
>> [Mon Sep 21 22:11:34 2009] [] child_rip+0xa/0x11
>> [Mon Sep 21 22:11:34 2009] [] kthread+0x0/0x73
>> [Mon Sep 21 22:11:34 2009] [] child_rip+0x0/0x11
>> [Mon Sep 21 22:11:34 2009]
>> [Mon Sep 21 22:11:34 2009]Kernel panic - not syncing: softlockup:
>> blocked tasks
>> [-- r...@localhost.localdomain attached -- Mon Sep 21 22:17:24 2009]
>> [-- r...@localhost.localdomain detached -- Mon Sep 21 23:16:22 2009]
>> [-- Console down -- Tue Sep 22 04:02:15 2009]
>
> That is unfortunate.
>
>> The two systems are hardware and OS identical:
>>
>> [r...@client-nfs5 log]# uname -a
>> Linux client-nfs5 2.6.27.25 #2 SMP Fri Jun 26 00:07

Re: [DRBD-user] Page allocation errors and kernel panics with drbd 8.3.3rc1 and infiniband

2009-09-22 Thread Lars Ellenberg
On Tue, Sep 22, 2009 at 01:31:26PM -0400, Jason McKay wrote:
> Hello all,
> 
> We're experiencing page allocation errors when writing to a connected
> drbd device (connected via infiniband using IPoIB) that resulted in a
> kernel panic last night.
> 
> When rsyncing data to the primary node, we get a slew of these errors in 
> /var/log/messages:
> 
> Sep 21 22:11:21 client-nfs5 kernel: drbd0_worker: page allocation failure. 
> order:5, mode:0x10
> Sep 21 22:11:21 client-nfs5 kernel: Pid: 6114, comm: drbd0_worker Tainted: P  
> 2.6.27.25 #2
> Sep 21 22:11:21 client-nfs5 kernel:
> Sep 21 22:11:21 client-nfs5 kernel: Call Trace:
> Sep 21 22:11:21 client-nfs5 kernel:  [] 
> __alloc_pages_internal+0x3a4/0x3c0
> Sep 21 22:11:21 client-nfs5 kernel:  [] 
> kmem_getpages+0x6b/0x12b
> Sep 21 22:11:21 client-nfs5 kernel:  [] 
> fallback_alloc+0x11d/0x1b1
> Sep 21 22:11:21 client-nfs5 kernel:  [] 
> kmem_cache_alloc_node+0xa3/0xcf
> Sep 21 22:11:21 client-nfs5 kernel:  [] 
> __alloc_skb+0x64/0x12e
> Sep 21 22:11:21 client-nfs5 kernel:  [] 
> sk_stream_alloc_skb+0x2f/0xd5
> Sep 21 22:11:21 client-nfs5 kernel:  [] 
> tcp_sendmsg+0x180/0x9d1
> Sep 21 22:11:21 client-nfs5 kernel:  [] 
> sock_sendmsg+0xe2/0xff
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> autoremove_wake_function+0x0/0x2e
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> autoremove_wake_function+0x0/0x2e
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> zone_statistics+0x3a/0x5d
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> kernel_sendmsg+0x2c/0x3e
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> drbd_send+0xb2/0x194 [drbd]
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> _drbd_send_cmd+0x9c/0x116 [drbd]
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> send_bitmap_rle_or_plain+0xd7/0x13a [drbd]
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> _drbd_send_bitmap+0x18d/0x1ae [drbd]
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> drbd_send_bitmap+0x39/0x4c [drbd]

I certainly would not expect to _cause_
memory pressure from this call path.
But someone is causing it, and we are affected.


> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> w_bitmap_io+0x45/0x95 [drbd]
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> drbd_worker+0x230/0x3eb [drbd]
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> drbd_thread_setup+0x124/0x1ba [drbd]
> Sep 21 22:11:22 client-nfs5 kernel:  [] child_rip+0xa/0x11
> Sep 21 22:11:22 client-nfs5 kernel:  [] 
> drbd_thread_setup+0x0/0x1ba [drbd]
> Sep 21 22:11:22 client-nfs5 kernel:  [] child_rip+0x0/0x11

Why is the TCP stack trying to allocate order:5 pages?

That's 32 contiguous pages, 128 KiB of contiguous memory. Apparently your
memory is so fragmented that this is no longer available.
But shouldn't the TCP stack be OK with order:0 or order:1 pages?
Why does it have to be order:5?
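
One way to see how fragmented memory actually is: /proc/buddyinfo lists the
number of free blocks of each order per zone (nothing DRBD-specific):

    # columns are counts of free blocks of order 0, 1, 2, ... per zone;
    # zeros in the order-5 column and beyond mean no free 128 KiB contiguous block
    cat /proc/buddyinfo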

> These were occurring for hours until a kernel panic:
> 
> [Mon Sep 21 22:11:33 2009]INFO: task kjournald:7025 blocked for more than 120 
> seconds.
> [Mon Sep 21 22:11:33 2009]"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> disables this message.
> [Mon Sep 21 22:11:33 2009] 88062e801d30 0046 88062e801cf8 
> a056b821
> [Mon Sep 21 22:11:33 2009] 880637d617c0 88063dd05120 88063e668750 
> 88063dd05478
> [Mon Sep 21 22:11:33 2009] 00090002 00011a942a2d  
> 
> [Mon Sep 21 22:11:33 2009]Call Trace:
> [Mon Sep 21 22:11:33 2009] [] drbd_unplug_fn+0x14a/0x1aa 
> [drbd]
> [Mon Sep 21 22:11:33 2009] [] sync_buffer+0x0/0x3f
> [Mon Sep 21 22:11:33 2009] [] io_schedule+0x5d/0x9f
> [Mon Sep 21 22:11:33 2009] [] sync_buffer+0x3b/0x3f
> [Mon Sep 21 22:11:33 2009] [] __wait_on_bit+0x40/0x6f
> [Mon Sep 21 22:11:33 2009] [] sync_buffer+0x0/0x3f
> [Mon Sep 21 22:11:34 2009] [] 
> out_of_line_wait_on_bit+0x6c/0x78
> [Mon Sep 21 22:11:34 2009] [] wake_bit_function+0x0/0x23
> [Mon Sep 21 22:11:34 2009] [] 
> journal_commit_transaction+0x7cd/0xc4d [jbd]
> [Mon Sep 21 22:11:34 2009] [] lock_timer_base+0x26/0x4b
> [Mon Sep 21 22:11:34 2009] [] kjournald+0xc1/0x1fb [jbd]
> [Mon Sep 21 22:11:34 2009] [] 
> autoremove_wake_function+0x0/0x2e
> [Mon Sep 21 22:11:34 2009] [] kjournald+0x0/0x1fb [jbd]
> [Mon Sep 21 22:11:34 2009] [] kthread+0x47/0x73
> [Mon Sep 21 22:11:34 2009] [] child_rip+0xa/0x11
> [Mon Sep 21 22:11:34 2009] [] kthread+0x0/0x73
> [Mon Sep 21 22:11:34 2009] [] child_rip+0x0/0x11
> [Mon Sep 21 22:11:34 2009]
> [Mon Sep 21 22:11:34 2009]Kernel panic - not syncing: softlockup: blocked 
> tasks
> [-- r...@localhost.localdomain attached -- Mon Sep 21 22:17:24 2009]
> [-- r...@localhost.localdomain detached -- Mon Sep 21 23:16:22 2009]
> [-- Console down -- Tue Sep 22 04:02:15 2009]

That is unfortunate.

> The two systems are hardware and OS identical:
> 
> [r...@client-nfs5 log]# uname -a
> Linux client-nfs5 2.6.27.25 #2 SMP Fri Jun 26 00:07:23 EDT 2009 x86_64 x86_64 
> x86_64 GNU/Linux
> 
> [r...@client-nfs5 log]# grep model\ name /proc/cpuinfo 
> model name: Intel(R) Xeon(R) CPU   E5520  @ 2.27GHz
> m