Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-09-09 Thread Lukas Kolbe
On Thu, 2010-09-09 at 04:23 +0100, Ben Hutchings wrote:
> On Tue, 2010-09-07 at 13:25 +0200, Lukas Kolbe wrote:
> > On Wed, 2010-09-01 at 05:26 +0100, Ben Hutchings wrote:
> > > On Tue, 2010-08-31 at 17:34 +0200, Lukas Kolbe wrote:
> > > > On Tue, 2010-08-31 at 06:35 -0700, Greg KH wrote:
> > > [...]
> > > > > Then how about convincing the Debian kernel developers to accept these
> > > > > patches, and work through any regressions that might be found and
> > > > > after that, reporting back to us?
> > > > 
> > > > Ben?
> > > > 
> > > > The reason I contacted you was precisely because it went into 2.6.33.2,
> > > > i.e. was already accepted into a -stable release. I didn't expect it to
> > > > be such an issue.
> > > 
> > > That's not likely if people spread FUD about the backlog patches!
> > > 
> > > Dave, did you explicitly exclude these patches from 2.6.32 when you
> > > submitted them to stable, or is it just that 5534979 "udp: use limited
> > > socket backlog" depends on a1ab77f "ipv6: udp: Optimise multicast
> > > reception"?  The former patch doesn't look too hard to backport to
> > > 2.6.32 (see below).
> > 
> > Anybody?
> > We've currently rolled out our own 2.6.32 kernel with these fixes
> > applied, and they indeed fix a system crash under our nfs-load. What
> > else can I do to get these fixes into either Debian's 2.6.32 or Greg's
> > stable 2.6.32 series?
> [...]
> 
> These patches will be included in Debian's version 2.6.32-22.  We'll see
> how that goes.

I owe you a few beers. Thanks a million!

> Ben.

Lukas










Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-09-08 Thread Ben Hutchings
On Tue, 2010-09-07 at 13:25 +0200, Lukas Kolbe wrote:
> On Wed, 2010-09-01 at 05:26 +0100, Ben Hutchings wrote:
> > On Tue, 2010-08-31 at 17:34 +0200, Lukas Kolbe wrote:
> > > On Tue, 2010-08-31 at 06:35 -0700, Greg KH wrote:
> > [...]
> > > > Then how about convincing the Debian kernel developers to accept these
> > > > patches, and work through any regressions that might be found and after
> > > > that, reporting back to us?
> > > 
> > > Ben?
> > > 
> > > The reason I contacted you was precisely because it went into 2.6.33.2,
> > > i.e. was already accepted into a -stable release. I didn't expect it to
> > > be such an issue.
> > 
> > That's not likely if people spread FUD about the backlog patches!
> > 
> > Dave, did you explicitly exclude these patches from 2.6.32 when you
> > submitted them to stable, or is it just that 5534979 "udp: use limited
> > socket backlog" depends on a1ab77f "ipv6: udp: Optimise multicast
> > reception"?  The former patch doesn't look too hard to backport to
> > 2.6.32 (see below).
> 
> Anybody?
> We've currently rolled out our own 2.6.32 kernel with these fixes
> applied, and they indeed fix a system crash under our nfs-load. What
> else can I do to get these fixes into either Debian's 2.6.32 or Greg's
> stable 2.6.32 series?
[...]

These patches will be included in Debian's version 2.6.32-22.  We'll see
how that goes.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.




Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-09-07 Thread Lukas Kolbe
On Wed, 2010-09-01 at 05:26 +0100, Ben Hutchings wrote:
> On Tue, 2010-08-31 at 17:34 +0200, Lukas Kolbe wrote:
> > On Tue, 2010-08-31 at 06:35 -0700, Greg KH wrote:
> [...]
> > > Then how about convincing the Debian kernel developers to accept these
> > > patches, and work through any regressions that might be found and after
> > > that, reporting back to us?
> > 
> > Ben?
> > 
> > The reason I contacted you was precisely because it went into 2.6.33.2,
> > i.e. was already accepted into a -stable release. I didn't expect it to
> > be such an issue.
> 
> That's not likely if people spread FUD about the backlog patches!
> 
> Dave, did you explicitly exclude these patches from 2.6.32 when you
> submitted them to stable, or is it just that 5534979 "udp: use limited
> socket backlog" depends on a1ab77f "ipv6: udp: Optimise multicast
> reception"?  The former patch doesn't look too hard to backport to
> 2.6.32 (see below).

Anybody?
We've currently rolled out our own 2.6.32 kernel with these fixes
applied, and they indeed fix a system crash under our nfs-load. What
else can I do to get these fixes into either Debian's 2.6.32 or Greg's
stable 2.6.32 series?

> Ben.
> 
> From: Zhu Yi 
> Date: Thu, 4 Mar 2010 18:01:42 +
> Subject: [PATCH] udp: use limited socket backlog
> 
> [ Upstream commit 55349790d7cbf0d381873a7ece1dcafcffd4aaa9 ]
> 
> Make udp adapt to the limited socket backlog change.
> 
> Cc: "David S. Miller" 
> Cc: Alexey Kuznetsov 
> Cc: "Pekka Savola (ipv6)" 
> Cc: Patrick McHardy 
> Signed-off-by: Zhu Yi 
> Acked-by: Eric Dumazet 
> Signed-off-by: David S. Miller 
> Signed-off-by: Greg Kroah-Hartman 
> [bwh: Backport to 2.6.32]


Regards,
Lukas








Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-31 Thread Ben Hutchings
On Tue, 2010-08-31 at 17:34 +0200, Lukas Kolbe wrote:
> On Tue, 2010-08-31 at 06:35 -0700, Greg KH wrote:
[...]
> > Then how about convincing the Debian kernel developers to accept these
> > patches, and work through any regressions that might be found and after
> > that, reporting back to us?
> 
> Ben?
> 
> The reason I contacted you was precisely because it went into 2.6.33.2,
> i.e. was already accepted into a -stable release. I didn't expect it to
> be such an issue.

That's not likely if people spread FUD about the backlog patches!

Dave, did you explicitly exclude these patches from 2.6.32 when you
submitted them to stable, or is it just that 5534979 "udp: use limited
socket backlog" depends on a1ab77f "ipv6: udp: Optimise multicast
reception"?  The former patch doesn't look too hard to backport to
2.6.32 (see below).

Ben.

From: Zhu Yi 
Date: Thu, 4 Mar 2010 18:01:42 +
Subject: [PATCH] udp: use limited socket backlog

[ Upstream commit 55349790d7cbf0d381873a7ece1dcafcffd4aaa9 ]

Make udp adapt to the limited socket backlog change.

Cc: "David S. Miller" 
Cc: Alexey Kuznetsov 
Cc: "Pekka Savola (ipv6)" 
Cc: Patrick McHardy 
Signed-off-by: Zhu Yi 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
[bwh: Backport to 2.6.32]
---
 net/ipv4/udp.c |    6 ++++--
 net/ipv6/udp.c |   20 ++++++++++++++------
 2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index c322f44..0ea57b1 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1174,8 +1174,10 @@ int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	bh_lock_sock(sk);
 	if (!sock_owned_by_user(sk))
 		rc = __udp_queue_rcv_skb(sk, skb);
-	else
-		sk_add_backlog(sk, skb);
+	else if (sk_add_backlog_limited(sk, skb)) {
+		bh_unlock_sock(sk);
+		goto drop;
+	}
 	bh_unlock_sock(sk);
 
 	return rc;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index cf538ed..154dd6b 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -470,16 +470,20 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 			bh_lock_sock(sk2);
 			if (!sock_owned_by_user(sk2))
 				udpv6_queue_rcv_skb(sk2, buff);
-			else
-				sk_add_backlog(sk2, buff);
+			else if (sk_add_backlog_limited(sk2, buff)) {
+				atomic_inc(&sk2->sk_drops);
+				kfree_skb(buff);
+			}
 			bh_unlock_sock(sk2);
 		}
 	}
 	bh_lock_sock(sk);
 	if (!sock_owned_by_user(sk))
 		udpv6_queue_rcv_skb(sk, skb);
-	else
-		sk_add_backlog(sk, skb);
+	else if (sk_add_backlog_limited(sk, skb)) {
+		atomic_inc(&sk->sk_drops);
+		kfree_skb(skb);
+	}
 	bh_unlock_sock(sk);
 out:
 	spin_unlock(&hslot->lock);
@@ -598,8 +602,12 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 	bh_lock_sock(sk);
 	if (!sock_owned_by_user(sk))
 		udpv6_queue_rcv_skb(sk, skb);
-	else
-		sk_add_backlog(sk, skb);
+	else if (sk_add_backlog_limited(sk, skb)) {
+		atomic_inc(&sk->sk_drops);
+		bh_unlock_sock(sk);
+		sock_put(sk);
+		goto discard;
+	}
 	bh_unlock_sock(sk);
 	sock_put(sk);
 	return 0;
-- 
1.7.1
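
For reference, the helper these hunks call into is not part of the diff; it
comes from "net: add limit for socket backlog" (upstream 8eae939f). Below is a
sketch of roughly what that commit adds to include/net/sock.h, reconstructed
from the 2.6.33-era code - treat the names and the exact limit formula as
illustrative rather than verbatim:

/* The per-socket spinlock must be held here.  Sketch: refuse to enqueue
 * once the backlog reaches a cap derived from the receive buffer size;
 * callers handle -ENOBUFS by dropping the skb (as in the hunks above).
 */
static inline int sk_add_backlog_limited(struct sock *sk, struct sk_buff *skb)
{
	if (sk->sk_backlog.len >= max(sk->sk_backlog.limit, sk->sk_rcvbuf << 1))
		return -ENOBUFS;

	sk_add_backlog(sk, skb);		/* the old, unlimited enqueue */
	sk->sk_backlog.len += skb->truesize;	/* account the queued bytes */
	return 0;
}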

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.




Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-31 Thread Lukas Kolbe
On Tue, 2010-08-31 at 06:35 -0700, Greg KH wrote:
> On Tue, Aug 31, 2010 at 10:16:56AM +0200, Lukas Kolbe wrote:
> > On Mon, 2010-08-30 at 10:21 -0700, Greg KH wrote:
> > > On Mon, Aug 30, 2010 at 09:46:36AM -0700, David Miller wrote:
> > > > From: Greg KH 
> > > > Date: Mon, 30 Aug 2010 07:50:17 -0700
> > > > 
> > > > > As I stated above, I need the ACK from David to be able to add these
> > > > > patches.
> > > > > 
> > > > > David?
> > > > 
> > > > I believe there were some regressions caused by these changes that were
> > > > fixed later, a bit after those commits went into the tree.
> > > > 
> > > > I'm only comfortable ACK'ing this if someone does due diligence and
> > > > checks for any such follow-on fixes to that series.
> > > > 
> > > > It's a pretty non-trivial set of patches and has the potential to kill
> > > > performance which would be a very serious regression.
> > > 
> > > Fair enough.
> > 
> > Yep, thanks! 
> > 
> > > Who's done the checks to find out any problems with these patches?
> > 
> > I'll skim the changelogs in 2.6.3[345].x to see if there are any related
> > patches.
> >  
> > > And what is keeping you from moving to the .35 kernel tree instead?
> > 
> > Basically, distribution support. Debian Squeeze will ship with 2.6.32,
> > as Ubuntu already did for their current LTS - and I really want Debian's
> > kernel to be as reliable and stable as possible (btw, that's why I
> > initially reported this as a Debian bug, because at that time I wasn't
> > using vanilla kernels. Now that I know how git bisect works, it will
> > hopefully be easier for me to pinpoint regressions in the future).
> > 
> > Also, we do not really have enough hardware to test new upstream
> > releases thoroughly before going into production (e.g., we only have
> > one big tape library with one big disk pool, so we can't test whether
> > tape-on-mptsas and aacraid work properly and stably in newer upstream
> > releases).
> 
> Then how about convincing the Debian kernel developers to accept these
> patches, and work through any regressions that might be found and after
> that, reporting back to us?

Ben?

The reason I contacted you was precisely because it went into 2.6.33.2,
i.e. was already accepted into a -stable release. I didn't expect it to
be such an issue.

> thanks,
> 
> greg k-h

Regards,
Lukas








Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-31 Thread Greg KH
On Tue, Aug 31, 2010 at 10:16:56AM +0200, Lukas Kolbe wrote:
> On Mon, 2010-08-30 at 10:21 -0700, Greg KH wrote:
> > On Mon, Aug 30, 2010 at 09:46:36AM -0700, David Miller wrote:
> > > From: Greg KH 
> > > Date: Mon, 30 Aug 2010 07:50:17 -0700
> > > 
> > > > As I stated above, I need the ACK from David to be able to add these
> > > > patches.
> > > > 
> > > > David?
> > > 
> > > I believe there were some regressions caused by these changes that were
> > > fixed later, a bit after those commits went into the tree.
> > > 
> > > I'm only comfortable ACK'ing this if someone does due diligence and
> > > checks for any such follow-on fixes to that series.
> > > 
> > > It's a pretty non-trivial set of patches and has the potential to kill
> > > performance which would be a very serious regression.
> > 
> > Fair enough.
> 
> Yep, thanks! 
> 
> > Who's done the checks to find out any problems with these patches?
> 
> I'll skim the changelogs in 2.6.3[345].x to see if there are any related
> patches.
>  
> > And what is keeping you from moving to the .35 kernel tree instead?
> 
> Basically, distribution support. Debian Squeeze will ship with 2.6.32,
> as Ubuntu already did for their current LTS - and I really want Debian's
> kernel to be as reliable and stable as possible (btw, that's why I
> initially reported this as a Debian bug, because at that time I wasn't
> using vanilla kernels. Now that I know how git bisect works, it will
> hopefully be easier for me to pinpoint regressions in the future).
> 
> Also, we do not really have enough hardware to test new upstream
> releases thoroughly before going into production (e.g., we only have
> one big tape library with one big disk pool, so we can't test whether
> tape-on-mptsas and aacraid work properly and stably in newer upstream
> releases).

Then how about convincing the Debian kernel developers to accept these
patches, and work through any regressions that might be found and after
that, reporting back to us?

thanks,

greg k-h






Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-31 Thread Lukas Kolbe

> > Who's done the checks to find out any problems with these patches?
> 
> I'll skim the changelogs in 2.6.3[345].x to see if there are any related
> patches.

This is all I could find in current 2.6.36-rc2 (via git log | grep,
minus rps/rfs patches). I don't know anything about these, but they
sound related. If anybody with insight into the actual codebase could
take a look, that would help ...


commit c377411f2494a931ff7facdbb3a6839b1266bcf6
Author: Eric Dumazet 
Date:   Tue Apr 27 15:13:20 2010 -0700

net: sk_add_backlog() take rmem_alloc into account

Current socket backlog limit is not enough to really stop DDOS attacks,
because user thread spend many time to process a full backlog each
round, and user might crazy spin on socket lock.

We should add backlog size and receive_queue size (aka rmem_alloc) to
pace writers, and let user run without being slow down too much.

Introduce a sk_rcvqueues_full() helper, to avoid taking socket lock in
stress situations.

Under huge stress from a multiqueue/RPS enabled NIC, a single flow udp
receiver can now process ~200.000 pps (instead of ~100 pps before the
patch) on a 8 core machine.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
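
A sketch of the sk_rcvqueues_full() helper described above, reconstructed
from this changelog rather than copied from the tree (field names assumed
to match the rest of the series):

/* True once backlogged bytes plus bytes already charged to the receive
 * queue (rmem_alloc) exceed the socket receive buffer, so softirq
 * senders can bail out without taking the socket lock at all.
 */
static inline int sk_rcvqueues_full(const struct sock *sk,
				    const struct sk_buff *skb)
{
	unsigned int qsize = sk->sk_backlog.len +
			     atomic_read(&sk->sk_rmem_alloc);

	return qsize > sk->sk_rcvbuf;
}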

commit 6cce09f87a04797fae5b947ef2626c14a78f0b49
Author: Eric Dumazet 
Date:   Sun Mar 7 23:21:57 2010 +

tcp: Add SNMP counters for backlog and min_ttl drops

Commit 6b03a53a (tcp: use limited socket backlog) added the possibility
of dropping frames when backlog queue is full.

Commit d218d111 (tcp: Generalized TTL Security Mechanism) added the
possibility of dropping frames when TTL is under a given limit.

This patch adds new SNMP MIB entries, named TCPBacklogDrop and
TCPMinTTLDrop, published in /proc/net/netstat in TcpExt: line

netstat -s | egrep "TCPBacklogDrop|TCPMinTTLDrop"
TCPBacklogDrop: 0
TCPMinTTLDrop: 0

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 

commit 4045635318538d3ddd2007720412fdc4b08f6a62
Author: Zhu Yi 
Date:   Sun Mar 7 16:21:39 2010 +

net: add __must_check to sk_add_backlog

Add the "__must_check" tag to sk_add_backlog() so that any failure to
check and drop packets will be warned about.

Signed-off-by: Zhu Yi 
Signed-off-by: David S. Miller 
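
The change itself is essentially a one-line annotation; roughly
(reconstructed, not verbatim):

-static inline int sk_add_backlog(struct sock *sk, struct sk_buff *skb)
+static inline __must_check int sk_add_backlog(struct sock *sk,
+					      struct sk_buff *skb)

Any caller that then ignores the return value draws a compiler warning,
so a forgotten drop path is caught at build time rather than at runtime.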

commit b1faf5666438090a4dc4fceac8502edc7788b7e3
Author: Eric Dumazet 
Date:   Mon May 31 23:44:05 2010 -0700

net: sock_queue_err_skb() dont mess with sk_forward_alloc

Correct sk_forward_alloc handling for error_queue would need to use a
backlog of frames that softirq handler could not deliver because socket
is owned by user thread. Or extend backlog processing to be able to
process normal and error packets.

Another possibility is to not use mem charge for error queue, this is
what I implemented in this patch.

Note: this reverts commit 29030374
(net: fix sk_forward_alloc corruptions), since we dont need to lock
socket anymore.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 


commit dee42870a423ad485129f43cddfe7275479f11d8
Author: Changli Gao 
Date:   Sun May 2 05:42:16 2010 +

net: fix softnet_stat

Per cpu variable softnet_data.total was shared between IRQ and SoftIRQ
context without any protection. And enqueue_to_backlog should update
the netdev_rx_stat of the target CPU.

This patch renames softnet_data.total to softnet_data.processed: the
number of packets processed in upper levels (IP stacks).

softnet_stat data is moved into softnet_data.

Signed-off-by: Changli Gao 

 include/linux/netdevice.h |   17 +++--
 net/core/dev.c|   26 --
 net/sched/sch_generic.c   |2 +-
 3 files changed, 20 insertions(+), 25 deletions(-)
Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 

commit 4b0b72f7dd617b13abd1b04c947e15873e011a24
Author: Eric Dumazet 
Date:   Wed Apr 28 14:35:48 2010 -0700

net: speedup udp receive path

Since commit 95766fff ([UDP]: Add memory accounting.),
each received packet needs one extra sock_lock()/sock_release() pair.

This added latency because of possible backlog handling. Then later,
ticket spinlocks added yet another latency source in case of DDOS.

This patch introduces lock_sock_bh() and unlock_sock_bh()
synchronization primitives, avoiding one atomic operation and backlog
processing.

skb_free_datagram_locked() uses them instead of full blown
lock_sock()/release_sock(). skb is orphaned inside locked section for
proper socket memory reclaim, and finally freed outside of it.

UDP receive path now take the socket spinlock only once.

Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
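
The new primitives are thin wrappers around the per-socket spinlock;
approximately (a sketch based on this changelog):

/* Take only the socket spinlock with BHs disabled - cheaper than
 * lock_sock()/release_sock(), which also runs backlog processing.
 */
static inline void lock_sock_bh(struct sock *sk)
{
	spin_lock_bh(&sk->sk_lock.slock);
}

static inline void unlock_sock_bh(struct sock *sk)
{
	spin_unlock_bh(&sk->sk_lock.slock);
}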

commit 6e7676c1a76aed6e957611d8d7a9e5592e23aeba

Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-31 Thread Lukas Kolbe
On Mon, 2010-08-30 at 10:21 -0700, Greg KH wrote:
> On Mon, Aug 30, 2010 at 09:46:36AM -0700, David Miller wrote:
> > From: Greg KH 
> > Date: Mon, 30 Aug 2010 07:50:17 -0700
> > 
> > > As I stated above, I need the ACK from David to be able to add these
> > > patches.
> > > 
> > > David?
> > 
> > I believe there were some regressions caused by these changes that were
> > fixed later, a bit after those commits went into the tree.
> > 
> > I'm only comfortable ACK'ing this if someone does due diligence and
> > checks for any such follow-on fixes to that series.
> > 
> > It's a pretty non-trivial set of patches and has the potential to kill
> > performance which would be a very serious regression.
> 
> Fair enough.

Yep, thanks! 

> Who's done the checks to find out any problems with these patches?

I'll skim the changelogs in 2.6.3[345].x to see if there are any related
patches.
 
> And what is keeping you from moving to the .35 kernel tree instead?

Basically, distribution support. Debian Squeeze will ship with 2.6.32,
as Ubuntu already did for their current LTS - and I really want Debian's
kernel to be as reliable and stable as possible (btw, that's why I
initially reported this as a Debian bug, because at that time I wasn't
using vanilla kernels. Now that I know how git bisect works, it will
hopefully be easier for me to pinpoint regressions in the future).

Also, we do not really have enough hardware to test new upstream
releases thoroughly before going into production (e.g., we only have
one big tape library with one big disk pool, so we can't test whether
tape-on-mptsas and aacraid work properly and stably in newer upstream
releases).

Kind regards,
Lukas








Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-30 Thread Greg KH
On Mon, Aug 30, 2010 at 09:46:36AM -0700, David Miller wrote:
> From: Greg KH 
> Date: Mon, 30 Aug 2010 07:50:17 -0700
> 
> > As I stated above, I need the ACK from David to be able to add these
> > patches.
> > 
> > David?
> 
> I believe there were some regressions caused by these changes that were
> fixed later, a bit after those commits went into the tree.
> 
> I'm only comfortable ACK'ing this if someone does due diligence and
> checks for any such follow-on fixes to that series.
> 
> It's a pretty non-trivial set of patches and has the potential to kill
> performance which would be a very serious regression.

Fair enough.

Who's done the checks to find out any problems with these patches?

And what is keeping you from moving to the .35 kernel tree instead?

thanks,

greg k-h






Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-30 Thread David Miller
From: Greg KH 
Date: Mon, 30 Aug 2010 07:50:17 -0700

> As I stated above, I need the ACK from David to be able to add these
> patches.
> 
> David?

I believe there were some regressions caused by these changes that were
fixed later, a bit after those commits went into the tree.

I'm only comfortable ACK'ing this if someone does due diligence and
checks for any such follow-on fixes to that series.

It's a pretty non-trivial set of patches and has the potential to kill
performance which would be a very serious regression.






Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-30 Thread Greg KH
On Mon, Aug 30, 2010 at 03:59:57PM +0200, Lukas Kolbe wrote:
> On Thu, 2010-08-26 at 09:32 +0200, Lukas Kolbe wrote:
> 
> Hi,
> 
> > > > I was finally able to identify the patch series that introduced the fix
> > > > (they were introduced to -stable in 2.6.33.2):
> > > > 
> > > > cb63112 net: add __must_check to sk_add_backlog
> > > > a12a9a2 net: backlog functions rename
> > > > 51c5db4 x25: use limited socket backlog
> > > > c531ab2 tipc: use limited socket backlog
> > > > 37d60aa sctp: use limited socket backlog
> > > > 9b3d968 llc: use limited socket backlog
> > > > 230401e udp: use limited socket backlog
> > > > 20a92ec tcp: use limited socket backlog
> > > > ab9dd05 net: add limit for socket backlog
> > > > 
> > > > After applying these to 2.6.32.17, I wasn't able to trigger the failure
> > > > anymore.
> > > 
> > > What "failure"?
> > > 
> > > > 230401e didn't apply cleanly with git cherry-pick on top of 2.6.32.17,
> > > > so there might be some additional work needed.
> > > > 
> > > > @Greg: would it be possible to have these fixes in the next 2.6.32? See
> > > > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=592187#69 for details:
> > > > they fix a guest network crash during heavy nfs-io using virtio.
> > > 
> > > These are a lot of patches, looking like they are adding a new feature.
> > > I would need to get the ack of the network maintainer before I can add
> > > them.
> > > 
> > > David?
> > 
> > I don't mean to nag (hm well, maybe I do) and I know you were busy
> > preparing the guard-page fixes, but what's the status of this? In the
> > meantime, we triggered this bug also on barebone hardware using nfs over
> > tcp with default [rw]sizes of about 1MiB. On the real hardware, the
> > kernel oopsed, not only the network stack ...
> > 
> > With these patches applied, everything works smoothly. I'd really love
> > to see a stable 2.6.32 ... 
> 
> Is there anything I can do to help reach a decision on this issue?

As I stated above, I need the ACK from David to be able to add these
patches.

David?

thanks,

greg k-h






Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-30 Thread Lukas Kolbe
On Thu, 2010-08-26 at 09:32 +0200, Lukas Kolbe wrote:

Hi,

> > > I was finally able to identify the patch series that introduced the fix
> > > (they were introduced to -stable in 2.6.33.2):
> > > 
> > > cb63112 net: add __must_check to sk_add_backlog
> > > a12a9a2 net: backlog functions rename
> > > 51c5db4 x25: use limited socket backlog
> > > c531ab2 tipc: use limited socket backlog
> > > 37d60aa sctp: use limited socket backlog
> > > 9b3d968 llc: use limited socket backlog
> > > 230401e udp: use limited socket backlog
> > > 20a92ec tcp: use limited socket backlog
> > > ab9dd05 net: add limit for socket backlog
> > > 
> > > After applying these to 2.6.32.17, I wasn't able to trigger the failure
> > > anymore.
> > 
> > What "failure"?
> > 
> > > 230401e didn't apply cleanly with git cherry-pick on top of 2.6.32.17,
> > > so there might be some additional work needed.
> > > 
> > > @Greg: would it be possible to have these fixes in the next 2.6.32? See
> > > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=592187#69 for details:
> > > they fix a guest network crash during heavy nfs-io using virtio.
> > 
> > These are a lot of patches, looking like they are adding a new feature.
> > I would need to get the ack of the network maintainer before I can add
> > them.
> > 
> > David?
> 
> I don't mean to nag (hm well, maybe I do) and I know you were busy
> preparing the guard-page fixes, but what's the status of this? In the
> meantime, we triggered this bug also on barebone hardware using nfs over
> tcp with default [rw]sizes of about 1MiB. On the real hardware, the
> kernel oopsed, not only the network stack ...
> 
> With these patches applied, everything works smoothly. I'd really love
> to see a stable 2.6.32 ... 

Is there anything I can do to help reach a decision on this issue?

Regards,
Lukas








Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-26 Thread Lukas Kolbe
Hi all,

> > I was finally able to identify the patch series that introduced the fix
> > (they were introduced to -stable in 2.6.33.2):
> > 
> > cb63112 net: add __must_check to sk_add_backlog
> > a12a9a2 net: backlog functions rename
> > 51c5db4 x25: use limited socket backlog
> > c531ab2 tipc: use limited socket backlog
> > 37d60aa sctp: use limited socket backlog
> > 9b3d968 llc: use limited socket backlog
> > 230401e udp: use limited socket backlog
> > 20a92ec tcp: use limited socket backlog
> > ab9dd05 net: add limit for socket backlog
> > 
> > After applying these to 2.6.32.17, I wasn't able to trigger the failure
> > anymore.
> 
> What "failure"?
> 
> > 230401e didn't apply cleanly with git cherry-pick on top of 2.6.32.17,
> > so there might be some additional work needed.
> > 
> > @Greg: would it be possible to have these fixes in the next 2.6.32? See
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=592187#69 for details:
> > they fix a guest network crash during heavy nfs-io using virtio.
> 
> These are a lot of patches, looking like they are adding a new feature.
> I would need to get the ack of the network maintainer before I can add
> them.
> 
> David?

I don't mean to nag (hm well, maybe I do) and I know you were busy
preparing the guard-page fixes, but what's the status of this? In the
meantime, we triggered this bug also on barebone hardware using nfs over
tcp with default [rw]sizes of about 1MiB. On the real hardware, the
kernel oopsed, not only the network stack ...

With these patches applied, everything works smoothly. I'd really love
to see a stable 2.6.32 ... 

> thanks,
> 
> greg k-h
> 

Regards,
Lukas Kolbe









Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-20 Thread Lukas Kolbe
Hi all,
 
> > I was finally able to identify the patch series that introduced the fix
> > (they were introduced to -stable in 2.6.33.2):
> > 
> > cb63112 net: add __must_check to sk_add_backlog
> > a12a9a2 net: backlog functions rename
> > 51c5db4 x25: use limited socket backlog
> > c531ab2 tipc: use limited socket backlog
> > 37d60aa sctp: use limited socket backlog
> > 9b3d968 llc: use limited socket backlog
> > 230401e udp: use limited socket backlog
> > 20a92ec tcp: use limited socket backlog
> > ab9dd05 net: add limit for socket backlog
> > 
> > After applying these to 2.6.32.17, I wasn't able to trigger the failure
> > anymore.
> 
> What "failure"?

From my other mail, for public reference:

> With 2.6.32.17 as a KVM guest using virtio_net, large nfs reads and
> writes cause the network to crash. Only rmmod virtio_net/modprobe
> virtio_net fixes it. I found that this bug was fixed in 2.6.33.2, and
> git bisect pointed me to the following patch series, which, when
> applied to 2.6.32.17, fixes the problem:

I have to add that this also happens on bare-metal systems on real
hardware - we just had a machine crash during its nightly nfs backup
with a slew of page allocation failures.

> > 230401e didn't apply cleanly with git cherry-pick on top of 2.6.32.17,
> > so there might be some additional work needed.
> > 
> > @Greg: would it be possible to have these fixes in the next 2.6.32? See
> > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=592187#69 for details:
> > they fix a guest network crash during heavy nfs-io using virtio.
> 
> These are a lot of patches, looking like they are adding a new feature.
> I would need to get the ack of the network maintainer before I can add
> them.
> 
> David?

FYI, below is the commit message of the original patch series as applied
to 2.6.33. It's not a new feature per se, but a fix to a general problem.

Author: Zhu Yi 
Date:   Thu Mar 4 18:01:40 2010 +

net: add limit for socket backlog

[ Upstream commit 8eae939f1400326b06d0c9afe53d2a484a326871 ]

We got system OOM while running some UDP netperf testing on the loopback
device. The case is multiple senders sent stream UDP packets to a single
receiver via loopback on local host. Of course, the receiver is not able
to handle all the packets in time. But we surprisingly found that these
packets were not discarded due to the receiver's sk->sk_rcvbuf limit.
Instead, they are kept queuing to sk->sk_backlog and finally ate up all
the memory. We believe this is a secure hole that a none privileged user
can crash the system.

The root cause for this problem is, when the receiver is doing
__release_sock() (i.e. after userspace recv, kernel udp_recvmsg ->
skb_free_datagram_locked -> release_sock), it moves skbs from backlog to
sk_receive_queue with the softirq enabled. In the above case, multiple
busy senders will almost make it an endless loop. The skbs in the
backlog end up eat all the system memory.

The issue is not only for UDP. Any protocols using socket backlog is
potentially affected. The patch adds limit for socket backlog so that
the backlog size cannot be expanded endlessly.

Reported-by: Alex Shi 
Cc: David Miller 
Cc: Arnaldo Carvalho de Melo 
Cc: Alexey Kuznetsov 
Cc: "Pekka Savola (ipv6)" 
Cc: Patrick McHardy 
Cc: Vlad Yasevich 
Cc: Sridhar Samudrala 
Cc: Jon Maloy 
Cc: Allan Stephens 
Cc: Andrew Hendry 
Signed-off-by: Zhu Yi 
Signed-off-by: Eric Dumazet 
Acked-by: Arnaldo Carvalho de Melo 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
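
To illustrate why this reads as a fix rather than a feature: the core of
the series is a small piece of per-socket bookkeeping plus a cap. The
struct sock change is roughly the following (field names per the commit,
layout illustrative, not verbatim):

struct sock {
	/* ... */
	struct {
		struct sk_buff	*head;	/* oldest backlogged skb */
		struct sk_buff	*tail;	/* newest backlogged skb */
		int		len;	/* new: bytes currently backlogged */
		int		limit;	/* new: cap checked on each enqueue */
	} sk_backlog;
	/* ... */
};

Each protocol then switches from the unconditional enqueue to the limited
variant and drops the packet when the cap is hit, instead of letting the
backlog grow without bound.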

> thanks,
> 
> greg k-h
> 

Regards,
Lukas








Bug#592187: [stable] Bug#576838: virtio network crashes again

2010-08-19 Thread Greg KH
On Sun, Aug 15, 2010 at 09:37:34AM +0200, Lukas Kolbe wrote:
> Hi Ben, Greg,
> 
> I was finally able to identify the patch series that introduced the fix
> (they were introduced to -stable in 2.6.33.2):
> 
> cb63112 net: add __must_check to sk_add_backlog
> a12a9a2 net: backlog functions rename
> 51c5db4 x25: use limited socket backlog
> c531ab2 tipc: use limited socket backlog
> 37d60aa sctp: use limited socket backlog
> 9b3d968 llc: use limited socket backlog
> 230401e udp: use limited socket backlog
> 20a92ec tcp: use limited socket backlog
> ab9dd05 net: add limit for socket backlog
> 
> After applying these to 2.6.32.17, I wasn't able to trigger the failure
> anymore.

What "failure"?

> 230401e didn't apply cleanly with git cherry-pick on top of 2.6.32.17,
> so there might be some additional work needed.
> 
> @Greg: would it be possible to have these fixes in the next 2.6.32? See
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=592187#69 for details:
> they fix a guest network crash during heavy nfs-io using virtio.

These are a lot of patches, looking like they are adding a new feature.
I would need to get the ack of the network maintainer before I can add
them.

David?

thanks,

greg k-h






Bug#592187: Bug#576838: virtio network crashes again

2010-08-15 Thread Lukas Kolbe
Hi Ben, Greg,

I was finally able to identify the patch series that introduced the fix
(they were introduced to -stable in 2.6.33.2):

cb63112 net: add __must_check to sk_add_backlog
a12a9a2 net: backlog functions rename
51c5db4 x25: use limited socket backlog
c531ab2 tipc: use limited socket backlog
37d60aa sctp: use limited socket backlog
9b3d968 llc: use limited socket backlog
230401e udp: use limited socket backlog
20a92ec tcp: use limited socket backlog
ab9dd05 net: add limit for socket backlog

After applying these to 2.6.32.17, I wasn't able to trigger the failure
anymore.

230401e didn't apply cleanly with git cherry-pick on top of 2.6.32.17,
so there might be some additional work needed.

@Greg: would it be possible to have these fixes in the next 2.6.32? See
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=592187#69 for details:
they fix a guest network crash during heavy nfs-io using virtio.

Kind regards,
Lukas








Bug#592187: Bug#576838: virtio network crashes again

2010-08-11 Thread Lukas Kolbe
On Wed, 2010-08-11 at 04:13 +0100, Ben Hutchings wrote:
> On Mon, 2010-08-09 at 11:24 +0200, Lukas Kolbe wrote:
> > So, testing begins.
> > 
> > First conclusion: not all traffic patterns produce the page allocation
> > failure. rdiff-backup only writing to an nfs-share does no harm;
> > rdiff-backup reading and writing (incremental backup) leads to (nearly
> > immediate) error.
> > 
> > The nfs-share is always mounted with proto=tcp and nfsv3; /proc/mounts says:
> > fileserver.backup...:/export/backup/lbork /.cbackup-mp nfs 
> > rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=65535,timeo=600,retrans=2,sec=sys,mountport=65535,addr=x.x.x.x
> >  0 0
> [...]
> 
> I've seen some recent discussion of a bug in the Linux NFS client that
> can cause it to stop working entirely in case of some packet loss events.
> It is possible
> that you are running into that bug.  I haven't yet seen an agreement on
> the fix for it.

Thanks, I'll look into it. I ran some further tests with vanilla and
debian kernels:

VERSION              WORKING
----------------------------------------------
2.6.35               yes
2.6.33.6             yes
2.6.32.17            doesn't boot as kvm guest
2.6.32.17-2.6.32-19  no
2.6.32.17-2.6.32-18  no
2.6.32.16            no

I don't know if this is related to #16494 since I'm unable to trigger it
on 2.6.33.6 or 2.6.35. I'll test 2.6.32 with the patch from
http://lkml.org/lkml/2010/8/10/52 applied as well and bisect between
2.6.32.17 and 2.6.33.6 in the next few days.

> I also wonder whether the extremely large request sizes (rsize and
> wsize) you have selected are more likely to trigger the allocation
> failure in virtio_net.  Please can you test whether reducing them helps?

The large rsize/wsize were automatically chosen, but I'll test with a
failing kernel and [rw]size of 32768.

Kind regards,
Lukas








Bug#592187: Bug#576838: virtio network crashes again

2010-08-10 Thread Ben Hutchings
On Mon, 2010-08-09 at 11:24 +0200, Lukas Kolbe wrote:
> So, testing begins.
> 
> First conclusion: not all traffic patterns produce the page allocation
> failure. rdiff-backup only writing to an nfs-share does no harm;
> rdiff-backup reading and writing (incremental backup) leads to (nearly
> immediate) error.
> 
> The nfs-share is always mounted with proto=tcp and nfsv3; /proc/mounts says:
> fileserver.backup...:/export/backup/lbork /.cbackup-mp nfs 
> rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=65535,timeo=600,retrans=2,sec=sys,mountport=65535,addr=x.x.x.x
>  0 0
[...]

I've seen some recent discussion of a bug in the Linux NFS client that
can cause it to stop working entirely in case of some packet loss events.
It is possible
that you are running into that bug.  I haven't yet seen an agreement on
the fix for it.

I also wonder whether the extremely large request sizes (rsize and
wsize) you have selected are more likely to trigger the allocation
failure in virtio_net.  Please can you test whether reducing them helps?

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.




Bug#592187: Bug#576838: virtio network crashes again

2010-08-09 Thread Lukas Kolbe
Okay, next round: this time 2.6.32-19 with virtio in the guest and
2.6.32-18 in the host - and sadly, it's not fixed:

[  159.772700] rdiff-backup.bi: page allocation failure. order:0, mode:0x20
[  159.772708] Pid: 2524, comm: rdiff-backup.bi Not tainted 2.6.32-5-amd64 #1
[  159.772710] Call Trace:
[  159.772712][] ? __alloc_pages_nodemask+0x55b/0x5d0
[  159.772759]  [] ? __alloc_skb+0x69/0x15a
[  159.772779]  [] ? try_fill_recv+0x8b/0x18b [virtio_net]
[  159.772784]  [] ? virtnet_poll+0x543/0x5c9 [virtio_net]
[  159.772799]  [] ? net_rx_action+0xae/0x1c9
[  159.772817]  [] ? __do_softirq+0xdd/0x1a0
[  159.772829]  [] ? skb_recv_done+0x28/0x34 [virtio_net]
[  159.772838]  [] ? call_softirq+0x1c/0x30
[  159.772843]  [] ? do_softirq+0x3f/0x7c
[  159.772845]  [] ? irq_exit+0x36/0x76
[  159.772847]  [] ? do_IRQ+0xa0/0xb6
[  159.772850]  [] ? ret_from_intr+0x0/0x11
[  159.772851][] ? kmap_skb_frag+0x3/0x43
[  159.772856]  [] ? skb_checksum+0xfa/0x23f
[  159.772858]  [] ? __skb_checksum_complete_head+0x15/0x55
[  159.772868]  [] ? tcp_checksum_complete_user+0x1f/0x3c
[  159.772870]  [] ? tcp_rcv_established+0x3c5/0x6d9
[  159.772875]  [] ? tcp_v4_do_rcv+0x1bb/0x376
[  159.772877]  [] ? tcp_write_xmit+0x883/0x96c
[  159.772880]  [] ? release_sock+0x46/0x96
[  159.772882]  [] ? tcp_sendmsg+0x78a/0x87e
[  159.772885]  [] ? sock_sendmsg+0xa3/0xbb
[  159.772894]  [] ? autoremove_wake_function+0x0/0x2e
[  159.772902]  [] ? zone_statistics+0x3c/0x5d
[  159.772906]  [] ? pick_next_task_fair+0xcd/0xd8
[  159.772919]  [] ? kernel_sendmsg+0x32/0x3f
[  159.772943]  [] ? xs_send_kvec+0x78/0x7f [sunrpc]
[  159.772948]  [] ? xs_sendpages+0x89/0x1a1 [sunrpc]
[  159.772953]  [] ? xs_tcp_send_request+0x44/0x131 [sunrpc]
[  159.772961]  [] ? xprt_transmit+0x17b/0x25a [sunrpc]
[  159.772996]  [] ? nfs3_xdr_readargs+0x7a/0x89 [nfs]
[  159.773000]  [] ? call_transmit+0x1fb/0x246 [sunrpc]
[  159.773009]  [] ? __rpc_execute+0x7d/0x24d [sunrpc]
[  159.773032]  [] ? rpc_run_task+0x53/0x5b [sunrpc]
[  159.773042]  [] ? nfs_read_rpcsetup+0x1d2/0x1f4 [nfs]
[  159.773048]  [] ? readpage_async_filler+0x0/0xbf [nfs]
[  159.773061]  [] ? nfs_pageio_doio+0x2a/0x51 [nfs]
[  159.773067]  [] ? nfs_pageio_add_request+0xc5/0xd5 [nfs]
[  159.773072]  [] ? readpage_async_filler+0x7d/0xbf [nfs]
[  159.773076]  [] ? read_cache_pages+0x91/0x105
[  159.773082]  [] ? nfs_readpages+0x155/0x1b4 [nfs]
[  159.773087]  [] ? nfs_pagein_one+0x0/0xd0 [nfs]
[  159.773092]  [] ? finish_task_switch+0x3a/0xaf
[  159.773094]  [] ? __do_page_cache_readahead+0x11b/0x1b4
[  159.773097]  [] ? ra_submit+0x1c/0x20
[  159.773099]  [] ? page_cache_async_readahead+0x75/0xad
[  159.773109]  [] ? generic_file_aio_read+0x23a/0x52b
[  159.773118]  [] ? do_sync_read+0xce/0x113
[  159.773124]  [] ? __switch_to+0x285/0x297
[  159.773126]  [] ? autoremove_wake_function+0x0/0x2e
[  159.773129]  [] ? finish_task_switch+0x3a/0xaf
[  159.773131]  [] ? vfs_read+0xa6/0xff
[  159.773133]  [] ? sys_read+0x45/0x6e
[  159.773136]  [] ? system_call_fastpath+0x16/0x1b
[  159.773138] Mem-Info:
[  159.773139] Node 0 DMA per-cpu:
[  159.773141] CPU0: hi:0, btch:   1 usd:   0
[  159.773143] CPU1: hi:0, btch:   1 usd:   0
[  159.773144] Node 0 DMA32 per-cpu:
[  159.773146] CPU0: hi:  186, btch:  31 usd: 184
[  159.773147] CPU1: hi:  186, btch:  31 usd:  39
[  159.773151] active_anon:5153 inactive_anon:2765 isolated_anon:0
[  159.773152]  active_file:17029 inactive_file:65343 isolated_file:0
[  159.773153]  unevictable:0 dirty:8266 writeback:0 unstable:443
[  159.773154]  free:787 slab_reclaimable:25621 slab_unreclaimable:3017
[  159.773154]  mapped:1946 shmem:238 pagetables:921 bounce:0
[  159.773156] Node 0 DMA free:1992kB min:84kB low:104kB high:124kB 
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:3276kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15372kB 
mlocked:0kB dirty:1232kB writeback:0kB mapped:0kB shmem:0kB 
slab_reclaimable:0kB slab_unreclaimable:60kB kernel_stack:0kB pagetables:0kB 
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  159.773164] lowmem_reserve[]: 0 489 489 489
[  159.773167] Node 0 DMA32 free:1156kB min:2784kB low:3480kB high:4176kB 
active_anon:20612kB inactive_anon:11060kB active_file:68116kB 
inactive_file:258096kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
present:500948kB mlocked:0kB dirty:31832kB writeback:0kB mapped:7784kB 
shmem:952kB slab_reclaimable:102484kB slab_unreclaimable:12008kB 
kernel_stack:1912kB pagetables:3684kB unstable:1772kB bounce:0kB 
writeback_tmp:0kB pages_scanned:128 all_unreclaimable? no
[  159.773175] lowmem_reserve[]: 0 0 0 0
[  159.773177] Node 0 DMA: 0*4kB 1*8kB 0*16kB 0*32kB 1*64kB 1*128kB 1*256kB 
1*512kB 1*1024kB 0*2048kB 0*4096kB = 1992kB
[  159.773183] Node 0 DMA32: 159*4kB 3*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 
0*512kB 0*1024kB 0*2048kB 0*4096kB = 1156kB
[  159.773193] 82614 total pagecache pages
[  159.773194] 0 pages in swap cache

Bug#592187: Bug#576838: virtio network crashes again

2010-08-09 Thread Lukas Kolbe
So, testing begins.

First conclusion: not all traffic patterns produce the page allocation
failure. rdiff-backup only writing to an nfs-share does no harm;
rdiff-backup reading and writing (incremental backup) leads to (nearly
immediate) error.

The nfs-share is always mounted with proto=tcp and nfsv3; /proc/mounts says:
fileserver.backup...:/export/backup/lbork /.cbackup-mp nfs 
rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=65535,timeo=600,retrans=2,sec=sys,mountport=65535,addr=x.x.x.x
 0 0


This is the result of 2.6.32-18 with virtio:

(/proc/meminfo within ten seconds of the page allocation failure, if
that helps)

MemTotal: 509072 kB
MemFree:   10356 kB
Buffers:4244 kB
Cached:   419996 kB
SwapCached:0 kB
Active:50856 kB
Inactive: 422424 kB
Active(anon):  24948 kB
Inactive(anon):25084 kB
Active(file):  25908 kB
Inactive(file):   397340 kB
Unevictable:   0 kB
Mlocked:   0 kB
SwapTotal:   4194296 kB
SwapFree:4194296 kB
Dirty:  5056 kB
Writeback: 0 kB
AnonPages: 49080 kB
Mapped: 7868 kB
Shmem:   952 kB
Slab:  11736 kB
SReclaimable:   5604 kB
SUnreclaim: 6132 kB
KernelStack:1920 kB
PageTables: 3728 kB
NFS_Unstable:  0 kB
Bounce:0 kB
WritebackTmp:  0 kB
CommitLimit: 4448832 kB
Committed_AS:1419384 kB
VmallocTotal:   34359738367 kB
VmallocUsed:5536 kB
VmallocChunk:   34359728048 kB
HardwareCorrupted: 0 kB
HugePages_Total:   0
HugePages_Free:0
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB
DirectMap4k:8180 kB
DirectMap2M:  516096 kB
[  170.625928] rdiff-backup.bi: page allocation failure. order:0, mode:0x20
[  170.625934] Pid: 2398, comm: rdiff-backup.bi Not tainted 2.6.32-5-amd64 #1
[  170.625935] Call Trace:
[  170.625937][] ? __alloc_pages_nodemask+0x55b/0x5d0
[  170.625993]  [] ? __alloc_skb+0x69/0x15a
[  170.626002]  [] ? try_fill_recv+0x8b/0x18b [virtio_net]
[  170.626004]  [] ? virtnet_poll+0x543/0x5c9 [virtio_net]
[  170.626010]  [] ? net_rx_action+0xae/0x1c9
[  170.626032]  [] ? __do_softirq+0xdd/0x1a0
[  170.626035]  [] ? skb_recv_done+0x28/0x34 [virtio_net]
[  170.626044]  [] ? call_softirq+0x1c/0x30
[  170.626049]  [] ? do_softirq+0x3f/0x7c
[  170.626051]  [] ? irq_exit+0x36/0x76
[  170.626053]  [] ? do_IRQ+0xa0/0xb6
[  170.626061]  [] ? ret_from_intr+0x0/0x11
[  170.626062]   
[  170.626063] Mem-Info:
[  170.626065] Node 0 DMA per-cpu:
[  170.626072] CPU0: hi:0, btch:   1 usd:   0
[  170.626073] CPU1: hi:0, btch:   1 usd:   0
[  170.626074] Node 0 DMA32 per-cpu:
[  170.626076] CPU0: hi:  186, btch:  31 usd:  30
[  170.626078] CPU1: hi:  186, btch:  31 usd: 181
[  170.626082] active_anon:6237 inactive_anon:6271 isolated_anon:0
[  170.626083]  active_file:6476 inactive_file:100535 isolated_file:32
[  170.626084]  unevictable:0 dirty:1008 writeback:0 unstable:2050
[  170.626084]  free:729 slab_reclaimable:1401 slab_unreclaimable:1762
[  170.626085]  mapped:1967 shmem:238 pagetables:932 bounce:0
[  170.626087] Node 0 DMA free:1980kB min:84kB low:104kB high:124kB 
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:13856kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15372kB 
mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:32kB 
slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  170.626099] lowmem_reserve[]: 0 489 489 489
[  170.626101] Node 0 DMA32 free:936kB min:2784kB low:3480kB high:4176kB 
active_anon:24948kB inactive_anon:25084kB active_file:25904kB 
inactive_file:388284kB unevictable:0kB isolated(anon):0kB isolated(file):128kB 
present:500948kB mlocked:0kB dirty:4032kB writeback:0kB mapped:7868kB 
shmem:952kB slab_reclaimable:5572kB slab_unreclaimable:7040kB 
kernel_stack:1912kB pagetables:3728kB unstable:8200kB bounce:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[  170.626110] lowmem_reserve[]: 0 0 0 0
[  170.626112] Node 0 DMA: 0*4kB 1*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 
1*512kB 1*1024kB 0*2048kB 0*4096kB = 1976kB
[  170.626118] Node 0 DMA32: 0*4kB 1*8kB 0*16kB 1*32kB 0*64kB 1*128kB 1*256kB 
1*512kB 0*1024kB 0*2048kB 0*4096kB = 936kB
[  170.626125] 107278 total pagecache pages
[  170.626126] 0 pages in swap cache
[  170.626127] Swap cache stats: add 0, delete 0, find 0/0
[  170.626128] Free swap  = 4194296kB
[  170.626130] Total swap = 4194296kB
[  170.631675] 131069 pages RAM
[  170.631677] 3801 pages reserved
[  170.631678] 23548 pages shared
[  170.631679] 113310 pages non-shared

And later on another test run, this time the network went down with it
and the system didn't shut down properly anymore; /proc/meminfo again
within ten seconds of the page allocation failure:

Bug#592187: Bug#576838: virtio network crashes again

2010-08-09 Thread Lukas Kolbe
Hi Ben,

On Sun, 2010-08-08 at 03:36 +0100, Ben Hutchings wrote:
> This is not the same bug as was originally reported, which is that
> virtio_net failed to retry refilling its RX buffer ring.  That is
> definitely fixed.  So I'm treating this as a new bug report, #592187.

Okay, thanks. 

> > > I think you need to give your guests more memory.
> > 
> > They all have between 512M and 2G - and it happens to all of them using
> > virtio_net, and none of them using rtl8139 as a network driver,
> > reproducibly.
> 
> The RTL8139 hardware uses a single fixed RX DMA buffer.  The virtio
> 'hardware' allows the host to write into RX buffers anywhere in guest
> memory.  This results in very different allocation patterns.
> 
> Please try specifying 'e1000' hardware, i.e. an Intel gigabit
> controller.  I think the e1000 driver will have a similar allocation
> pattern to virtio_net, so you can see whether it also triggers
> allocation failures and a network stall in the guest.
> 
> Also, please test Linux 2.6.35 in the guest.  This is packaged in the
> 'experimental' suite.

I'll rig up a test machine (the crashes all occurred on production
guests, unfortunately) and report back.

> [...]
> > If it were an OOM situation, wouldn't the OOM-killer be supposed to
> > kick in?
> [...]
> 
> The log you sent shows failure to allocate memory in an 'atomic' context
> where there is no opportunity to wait for pages to be swapped out.  The
> OOM killer isn't triggered until the system is running out of memory
> despite swapping out pages.

Ah, good to know, thanks!

> Also, I note that following the failure of virtio_net to refill its RX
> buffer ring, I see failures to allocate buffers for sending TCP ACKs.
> So the guest drops the ACKs, and that TCP connection will stall
> temporarily (until the peer re-sends the unacknowledged packets).
> 
> I also see 'nfs: server fileserver.backup.TechFak.Uni-Bielefeld.DE not
> responding, still trying'.  This suggests that the allocation failure in
> virtio_net has resulted in dropping packets from the NFS server.  And it
> just makes matters worse as it becomes impossible to free memory by
> flushing out buffers over NFS!

This sounds quite bad. 

This problem *seems* to be fixed by 2.6.32-19: we upgraded to that on a
different machine for host and guests, and an rsync of ~1TiB of data
didn't produce any page allocation failures using virtio. But I'd wait
for my tests with rsync/nfs and 2.6.32-18+e1000, 2.6.32-18+virtio,
2.6.32-19+virtio and 2.6.35+virtio to conclude that.

Thanks for taking your time to explain things!

-- 
Lukas








Bug#592187: Bug#576838: virtio network crashes again

2010-08-07 Thread Ben Hutchings
This is not the same bug as was originally reported, which is that
virtio_net failed to retry refilling its RX buffer ring.  That is
definitely fixed.  So I'm treating this as a new bug report, #592187.

On Sat, 2010-08-07 at 18:17 +0200, Lukas Kolbe wrote:
> On Sat, 2010-08-07 at 12:18 +0100, Ben Hutchings wrote:
> > On Sat, 2010-08-07 at 11:21 +0200, Lukas Kolbe wrote:
> > > Hi,
> > > 
> > > I sent this earlier today but the bug was archived so it didn't appear
> > > anywhere, hence the resend.
> > > 
> > > I believe this issue is not fixed at all in 2.6.32-18. We have seen this
> > > behaviour in various kvm guests using virtio_net with the same kernel in
> > > the guest only minutes after starting the nightly backup (rdiff-backup
> > > to an nfs-volume on a remote server), eventually leading to a
> > > non-functional network. Often, the machines even do not reboot and hang
> > > instead. Using the rtl8139 instead of virtio helps, but that's really
> > > only a clumsy workaround.
> > [...]
> > 
> > I think you need to give your guests more memory.
> 
> They all have between 512M and 2G - and it happens to all of them using
> virtio_net, and none of them using rtl8139 as a network driver,
> reproducibly.

The RTL8139 hardware uses a single fixed RX DMA buffer.  The virtio
'hardware' allows the host to write into RX buffers anywhere in guest
memory.  This results in very different allocation patterns.

Please try specifying 'e1000' hardware, i.e. an Intel gigabit
controller.  I think the e1000 driver will have a similar allocation
pattern to virtio_net, so you can see whether it also triggers
allocation failures and a network stall in the guest.

Also, please test Linux 2.6.35 in the guest.  This is packaged in the
'experimental' suite.

[...]
> If it were an OOM situation, wouldn't the OOM-killer be supposed to
> kick in?
[...]

The log you sent shows failure to allocate memory in an 'atomic' context
where there is no opportunity to wait for pages to be swapped out.  The
OOM killer isn't triggered until the system is running out of memory
despite swapping out pages.

Also, I note that following the failure of virtio_net to refill its RX
buffer ring, I see failures to allocate buffers for sending TCP ACKs.
So the guest drops the ACKs, and that TCP connection will stall
temporarily (until the peer re-sends the unacknowledged packets).

I also see 'nfs: server fileserver.backup.TechFak.Uni-Bielefeld.DE not
responding, still trying'.  This suggests that the allocation failure in
virtio_net has resulted in dropping packets from the NFS server.  And it
just makes matters worse as it becomes impossible to free memory by
flushing out buffers over NFS!

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.




Bug#576838: virtio network crashes again

2010-08-07 Thread Lukas Kolbe
On Sat, 2010-08-07 at 12:18 +0100, Ben Hutchings wrote:
> On Sat, 2010-08-07 at 11:21 +0200, Lukas Kolbe wrote:
> > Hi,
> > 
> > I sent this earlier today but the bug was archived so it didn't appear
> > anywhere, hence the resend.
> > 
> > I believe this issue is not fixed at all in 2.6.32-18. We have seen this
> > behaviour in various kvm guests using virtio_net with the same kernel in
> > the guest only minutes after starting the nightly backup (rdiff-backup
> > to an nfs-volume on a remote server), eventually leading to a
> > non-functional network. Often, the machines even do not reboot and hang
> > instead. Using the rtl8139 instead of virtio helps, but that's really
> > only a clumsy workaround.
> [...]
> 
> I think you need to give your guests more memory.

They all have between 512M and 2G - and it happens to all of them using
virtio_net, and none of them using rtl8139 as a network driver,
reproducibly. I would be delighted if it was as simple as giving them
more RAM, but sadly it isn't.

This is how we start the guests:

#!/bin/bash

KERNEL=2.6.32-5-amd64
NAME=tin

kvm -smp 2 \
 -drive if=virtio,file=/dev/system/tin_root,cache=off,boot=on \
 -drive if=virtio,file=/dev/system/tin_log,cache=off,boot=off \
 -drive if=virtio,file=/dev/system/tin_swap,cache=off,boot=off \
 -drive if=virtio,file=/dev/system/tin_data,cache=off,boot=off \
 -m 1024 \
 -nographic \
 -daemonize \
 -name ${NAME} \
 -kernel /boot/kvm/${NAME}/vmlinuz-${KERNEL} \
 -initrd /boot/kvm/${NAME}/initrd.img-${KERNEL} \
 -append "root=/dev/vda ro console=ttyS0,115200" \
 -serial mon:unix:/etc/kvm/consoles/${NAME}.sock,server,nowait \
 -net nic,macaddr=00:1A:4A:00:8E:3c,model=rtl8139 \
 -net tap,script=/etc/kvm/kvm-ifup-vlan142

Change model=rtl8139 to virtio, and the next time rdiff-backup runs, the
network stops working and eventually the guest hangs/can't be halted
anymore after a while.
qemu-kvm is version 0.12.4+dfsg-1, kernel is 2.6.32-18 on both host and
guest. And the page allocation failures look suspiciously similar to the
ones the original bug reporter saw when using 2.6.32-12.

If it were an OOM situation, wouldn't the OOM-killer be supposed to
kick in?

/proc/meminfo on the host:
sajama:~# cat /proc/meminfo
MemTotal:8197652 kB
MemFree: 2444964 kB
Buffers:   13560 kB
Cached:   128812 kB
SwapCached: 6892 kB
Active:  5102584 kB
Inactive: 316616 kB
Active(anon):5035456 kB
Inactive(anon):   242180 kB
Active(file):  67128 kB
Inactive(file):74436 kB
Unevictable:   0 kB
Mlocked:   0 kB
SwapTotal:   8388600 kB
SwapFree:8355640 kB
Dirty: 8 kB
Writeback: 0 kB
AnonPages:   5271936 kB
Mapped: 5892 kB
Shmem:   804 kB
Slab:  79844 kB
SReclaimable:  21184 kB
SUnreclaim:58660 kB
KernelStack:1880 kB
PageTables:14256 kB
NFS_Unstable:  0 kB
Bounce:0 kB
WritebackTmp:  0 kB
CommitLimit:12487424 kB
Committed_AS:6440192 kB
VmallocTotal:   34359738367 kB
VmallocUsed:  305788 kB
VmallocChunk:   34359332988 kB
HardwareCorrupted: 0 kB
HugePages_Total:   0
HugePages_Free:0
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB
DirectMap4k:7872 kB
DirectMap2M: 8380416 kB

/proc/meminfo on the guest (currently using rtl8139 as a network model):
lin...@tin:~$ cat /proc/meminfo 
MemTotal:1027200 kB
MemFree:   84336 kB
Buffers:   99588 kB
Cached:   152592 kB
SwapCached: 3160 kB
Active:   370304 kB
Inactive: 401924 kB
Active(anon): 264088 kB
Inactive(anon):   256724 kB
Active(file): 106216 kB
Inactive(file):   145200 kB
Unevictable:   0 kB
Mlocked:   0 kB
SwapTotal:   4194296 kB
SwapFree:4175892 kB
Dirty:16 kB
Writeback: 0 kB
AnonPages:517608 kB
Mapped:31348 kB
Shmem:   764 kB
Slab: 147396 kB
SReclaimable: 140440 kB
SUnreclaim: 6956 kB
KernelStack:1472 kB
PageTables: 9948 kB
NFS_Unstable:  0 kB
Bounce:0 kB
WritebackTmp:  0 kB
CommitLimit: 4707896 kB
Committed_AS: 893160 kB
VmallocTotal:   34359738367 kB
VmallocUsed:9096 kB
VmallocChunk:   34359724404 kB
HardwareCorrupted: 0 kB
HugePages_Total:   0
HugePages_Free:0
HugePages_Rsvd:0
HugePages_Surp:0
Hugepagesize:   2048 kB
DirectMap4k:8180 kB
DirectMap2M: 1040384 kB


-- 
Lukas








Bug#576838: virtio network crashes again

2010-08-07 Thread Ben Hutchings
On Sat, 2010-08-07 at 11:21 +0200, Lukas Kolbe wrote:
> Hi,
> 
> I sent this earlier today but the bug was archived so it didn't appear
> anywhere, hence the resend.
> 
> I believe this issue is not fixed at all in 2.6.32-18. We have seen this
> behaviour in various kvm guests using virtio_net with the same kernel in
> the guest only minutes after starting the nightly backup (rdiff-backup
> to an nfs-volume on a remote server), eventually leading to a
> non-functional network. Often, the machines even do not reboot and hang
> instead. Using the rtl8139 instead of virtio helps, but that's really
> only a clumsy workaround.
[...]

I think you need to give your guests more memory.

Ben.

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.

