Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Tue, Nov 19, 2013 at 02:00:11PM -0800, Eric Dumazet wrote: On Tue, 2013-11-19 at 23:53 +0200, Michael S. Tsirkin wrote: Which NIC? Virtio? Prior to 2613af0ed18a11d5c566a81f9a6510b73180660a it didn't drop packets received from host as far as I can tell. virtio is more like a pipe than a real NIC in this respect. Prior/after to this patch, you were not posting buffers, so if packets were received on a physical NIC, you were dropping the packets anyway. It makes no difference at all, adding a cushion might make you feel better, but its really not worth it. Under memory stress, it makes better sense to drop a super big GRO packet (The one needing frag_list extension ...) It gives a better signal to the sender to reduce its pressure, and gives opportunity to free more of your memory. OK, but in that case one wonders whether we should do more to free memory? E.g. imagine that we dropped a packet of a specific TCP flow because we couldn't allocate a new packet. What happens now is that the old packet is freed as well. So quite likely the next packet in queue will get processed since it will reuse the memory we have just freed. The next packet and the next after it etc all will have to go through the net stack until they get at the socket and are dropped then because we missed a segment. Even worse, GRO gets disabled so the load on receiver goes up instead of down. Sounds like a problem doesn't it? GRO actually detects it's the same flow and can see packet is out of sequence. Why doesn't it drop the packet then? Alternatively, we could (for example using the pre-allocated skb like I suggested) notify GRO that it should start dropping packets of this flow. What do you think? -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Tue, Nov 19, 2013 at 01:38:16PM -0800, Michael Dalton wrote: Great catch Jason. I agree this now raises the larger issue of how to handle a memory alloc failure in the middle of receive. As Eric mentioned, we can drop the packet and free the remaining (num_buf) frags. Michael, perhaps I'm missing something, but why would you prefer pre-allocating buffers in this case? If the guest kernel is OOM'ing, dropping packets should provide backpressure. Also, we could just as easily fail the initial skb alloc in page_to_skb, and I think that case also needs to be handled now in the same fashion as a memory allocation failure in receive_mergeable. Best, Mike Yes I missed this last night. Thanks a lot Eric and Michael for pointing this out. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Wed, 2013-11-20 at 10:58 +0200, Michael S. Tsirkin wrote: On Tue, Nov 19, 2013 at 02:00:11PM -0800, Eric Dumazet wrote: On Tue, 2013-11-19 at 23:53 +0200, Michael S. Tsirkin wrote: Which NIC? Virtio? Prior to 2613af0ed18a11d5c566a81f9a6510b73180660a it didn't drop packets received from host as far as I can tell. virtio is more like a pipe than a real NIC in this respect. Prior/after to this patch, you were not posting buffers, so if packets were received on a physical NIC, you were dropping the packets anyway. It makes no difference at all, adding a cushion might make you feel better, but its really not worth it. Under memory stress, it makes better sense to drop a super big GRO packet (The one needing frag_list extension ...) It gives a better signal to the sender to reduce its pressure, and gives opportunity to free more of your memory. OK, but in that case one wonders whether we should do more to free memory? E.g. imagine that we dropped a packet of a specific TCP flow because we couldn't allocate a new packet. What happens now is that the old packet is freed as well. So quite likely the next packet in queue will get processed since it will reuse the memory we have just freed. The next packet and the next after it etc all will have to go through the net stack until they get at the socket and are dropped then because we missed a segment. Even worse, GRO gets disabled so the load on receiver goes up instead of down. Sounds like a problem doesn't it? I see no problem at all. GRO is a hint for high rates (and obviously when there is enough memory) GRO actually detects it's the same flow and can see packet is out of sequence. Why doesn't it drop the packet then? Alternatively, we could (for example using the pre-allocated skb like I suggested) notify GRO that it should start dropping packets of this flow. What do you think? I think we disagree a lot on memory management on networking stacks. We did a lot of work in TCP stack and Qdisc layers to lower memory pressure (and bufferbloat), an you seem to try hard to introduce yet another layer of buffer bloat in virtio_net. So add whatever you want to proudly state to your management : Look how smart we are : we drop no packets in our layer ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Wed, Nov 20, 2013 at 07:16:33AM -0800, Eric Dumazet wrote: On Wed, 2013-11-20 at 10:58 +0200, Michael S. Tsirkin wrote: On Tue, Nov 19, 2013 at 02:00:11PM -0800, Eric Dumazet wrote: On Tue, 2013-11-19 at 23:53 +0200, Michael S. Tsirkin wrote: Which NIC? Virtio? Prior to 2613af0ed18a11d5c566a81f9a6510b73180660a it didn't drop packets received from host as far as I can tell. virtio is more like a pipe than a real NIC in this respect. Prior/after to this patch, you were not posting buffers, so if packets were received on a physical NIC, you were dropping the packets anyway. It makes no difference at all, adding a cushion might make you feel better, but its really not worth it. Under memory stress, it makes better sense to drop a super big GRO packet (The one needing frag_list extension ...) It gives a better signal to the sender to reduce its pressure, and gives opportunity to free more of your memory. OK, but in that case one wonders whether we should do more to free memory? E.g. imagine that we dropped a packet of a specific TCP flow because we couldn't allocate a new packet. What happens now is that the old packet is freed as well. So quite likely the next packet in queue will get processed since it will reuse the memory we have just freed. The next packet and the next after it etc all will have to go through the net stack until they get at the socket and are dropped then because we missed a segment. Even worse, GRO gets disabled so the load on receiver goes up instead of down. Sounds like a problem doesn't it? I see no problem at all. GRO is a hint for high rates (and obviously when there is enough memory) GRO actually detects it's the same flow and can see packet is out of sequence. Why doesn't it drop the packet then? Alternatively, we could (for example using the pre-allocated skb like I suggested) notify GRO that it should start dropping packets of this flow. What do you think? I think we disagree a lot on memory management on networking stacks. We did a lot of work in TCP stack and Qdisc layers to lower memory pressure (and bufferbloat), an you seem to try hard to introduce yet another layer of buffer bloat in virtio_net. So add whatever you want to proudly state to your management : Look how smart we are : we drop no packets in our layer Hmm some kind of disconnect here. I got you rmanagement about bufferbloat. What I am saying is that maybe we should drop packets more aggressively: when we drop one packet of a flow, why not drop everything that's queued and is for the same flow? ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Wed, 2013-11-20 at 18:06 +0200, Michael S. Tsirkin wrote: Hmm some kind of disconnect here. I got you rmanagement about bufferbloat. What I am saying is that maybe we should drop packets more aggressively: when we drop one packet of a flow, why not drop everything that's queued and is for the same flow? I really hope your TCP flows use SACK ;) Please read the rfc 2018 introduction for details. If a packet is dropped, it doesn't mean following packets _have_ to be dropped. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Wed, Nov 20, 2013 at 08:14:21AM -0800, Eric Dumazet wrote: On Wed, 2013-11-20 at 18:06 +0200, Michael S. Tsirkin wrote: Hmm some kind of disconnect here. I got you rmanagement about bufferbloat. What I am saying is that maybe we should drop packets more aggressively: when we drop one packet of a flow, why not drop everything that's queued and is for the same flow? I really hope your TCP flows use SACK ;) Please read the rfc 2018 introduction for details. If a packet is dropped, it doesn't mean following packets _have_ to be dropped. Got it, thanks. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
We need to drop the refcnt of page when we fail to allocate an skb for frag list, otherwise it will be leaked. The bug was introduced by commit 2613af0ed18a11d5c566a81f9a6510b73180660a (virtio_net: migrate mergeable rx buffers to page frag allocators). Cc: Michael Dalton mwdal...@google.com Cc: Eric Dumazet eduma...@google.com Cc: Rusty Russell ru...@rustcorp.com.au Cc: Michael S. Tsirkin m...@redhat.com Signed-off-by: Jason Wang jasow...@redhat.com --- The patch was needed for 3.12 stable. --- drivers/net/virtio_net.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 69ad42b..3798517 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -322,9 +322,11 @@ static int receive_mergeable(struct receive_queue *rq, struct sk_buff *head_skb) head_skb-dev-name); len = MERGE_BUFFER_LEN; } + page = virt_to_head_page(buf); if (unlikely(num_skb_frags == MAX_SKB_FRAGS)) { struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC); if (unlikely(!nskb)) { + put_page(page); head_skb-dev-stats.rx_dropped++; return -ENOMEM; } @@ -341,7 +343,6 @@ static int receive_mergeable(struct receive_queue *rq, struct sk_buff *head_skb) head_skb-len += len; head_skb-truesize += MERGE_BUFFER_LEN; } - page = virt_to_head_page(buf); offset = buf - (char *)page_address(page); if (skb_can_coalesce(curr_skb, num_skb_frags, page, offset)) { put_page(page); -- 1.8.3.2 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Tue, 2013-11-19 at 16:05 +0800, Jason Wang wrote: We need to drop the refcnt of page when we fail to allocate an skb for frag list, otherwise it will be leaked. The bug was introduced by commit 2613af0ed18a11d5c566a81f9a6510b73180660a (virtio_net: migrate mergeable rx buffers to page frag allocators). Cc: Michael Dalton mwdal...@google.com Cc: Eric Dumazet eduma...@google.com Cc: Rusty Russell ru...@rustcorp.com.au Cc: Michael S. Tsirkin m...@redhat.com Signed-off-by: Jason Wang jasow...@redhat.com --- The patch was needed for 3.12 stable. Good catch, but if we return from receive_mergeable() in the 'middle' of the frags we would need for the current skb, who will call the virtqueue_get_buf() to flush the remaining frags ? Don't we also need to call virtqueue_get_buf() like while (--num_buf) { buf = virtqueue_get_buf(rq-vq, len); if (!buf) break; put_page(virt_to_head_page(buf)); } ? ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Tue, Nov 19, 2013 at 06:03:48AM -0800, Eric Dumazet wrote: On Tue, 2013-11-19 at 16:05 +0800, Jason Wang wrote: We need to drop the refcnt of page when we fail to allocate an skb for frag list, otherwise it will be leaked. The bug was introduced by commit 2613af0ed18a11d5c566a81f9a6510b73180660a (virtio_net: migrate mergeable rx buffers to page frag allocators). Cc: Michael Dalton mwdal...@google.com Cc: Eric Dumazet eduma...@google.com Cc: Rusty Russell ru...@rustcorp.com.au Cc: Michael S. Tsirkin m...@redhat.com Signed-off-by: Jason Wang jasow...@redhat.com --- The patch was needed for 3.12 stable. Good catch, but if we return from receive_mergeable() in the 'middle' of the frags we would need for the current skb, who will call the virtqueue_get_buf() to flush the remaining frags ? Don't we also need to call virtqueue_get_buf() like while (--num_buf) { buf = virtqueue_get_buf(rq-vq, len); if (!buf) break; put_page(virt_to_head_page(buf)); } ? virtqueue_get_buf only gives you back a buffer that has been DMA-ed to by hardware. ATM there's no way to get back a buffer once you gave it to hardware without doing a NIC reset. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Tue, Nov 19, 2013 at 06:03:48AM -0800, Eric Dumazet wrote: On Tue, 2013-11-19 at 16:05 +0800, Jason Wang wrote: We need to drop the refcnt of page when we fail to allocate an skb for frag list, otherwise it will be leaked. The bug was introduced by commit 2613af0ed18a11d5c566a81f9a6510b73180660a (virtio_net: migrate mergeable rx buffers to page frag allocators). Cc: Michael Dalton mwdal...@google.com Cc: Eric Dumazet eduma...@google.com Cc: Rusty Russell ru...@rustcorp.com.au Cc: Michael S. Tsirkin m...@redhat.com Signed-off-by: Jason Wang jasow...@redhat.com --- The patch was needed for 3.12 stable. Good catch, but if we return from receive_mergeable() in the 'middle' of the frags we would need for the current skb, who will call the virtqueue_get_buf() to flush the remaining frags ? Don't we also need to call virtqueue_get_buf() like while (--num_buf) { buf = virtqueue_get_buf(rq-vq, len); if (!buf) break; put_page(virt_to_head_page(buf)); } ? Let me explain what worries me in your suggestion: struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC); if (unlikely(!nskb)) { head_skb-dev-stats.rx_dropped++; return -ENOMEM; } is this the failure case we are talking about? I think this is a symprom of a larger problem introduced by 2613af0ed18a11d5c566a81f9a6510b73180660a, namely that we now need to allocate memory in the middle of processing a packet. I think discarding a completely valid and well-formed packet from the receive queue because we are unable to allocate new memory with GFP_ATOMIC for future packets is not a good idea. It certainly violates the principle of least surprize: when one sees host pass packet to guest, one expects the packet to get into the networking stack, not get dropped by the driver internally. Guest stack can do with the packet what it sees fit. We actually wake up a thread if we can't fill up the queue, that will fill it up in GFP_KERNEL context. So I think we should find a way to pre-allocate if necessary and avoid error paths where allocating new memory is a required to avoid drops. -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Tue, 2013-11-19 at 22:49 +0200, Michael S. Tsirkin wrote: On Tue, Nov 19, 2013 at 06:03:48AM -0800, Eric Dumazet wrote: On Tue, 2013-11-19 at 16:05 +0800, Jason Wang wrote: We need to drop the refcnt of page when we fail to allocate an skb for frag list, otherwise it will be leaked. The bug was introduced by commit 2613af0ed18a11d5c566a81f9a6510b73180660a (virtio_net: migrate mergeable rx buffers to page frag allocators). Cc: Michael Dalton mwdal...@google.com Cc: Eric Dumazet eduma...@google.com Cc: Rusty Russell ru...@rustcorp.com.au Cc: Michael S. Tsirkin m...@redhat.com Signed-off-by: Jason Wang jasow...@redhat.com --- The patch was needed for 3.12 stable. Good catch, but if we return from receive_mergeable() in the 'middle' of the frags we would need for the current skb, who will call the virtqueue_get_buf() to flush the remaining frags ? Don't we also need to call virtqueue_get_buf() like while (--num_buf) { buf = virtqueue_get_buf(rq-vq, len); if (!buf) break; put_page(virt_to_head_page(buf)); } ? Let me explain what worries me in your suggestion: struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC); if (unlikely(!nskb)) { head_skb-dev-stats.rx_dropped++; return -ENOMEM; } is this the failure case we are talking about? I thought Jason patch was about this, no ? I think this is a symprom of a larger problem introduced by 2613af0ed18a11d5c566a81f9a6510b73180660a, namely that we now need to allocate memory in the middle of processing a packet. I think discarding a completely valid and well-formed packet from the receive queue because we are unable to allocate new memory with GFP_ATOMIC for future packets is not a good idea. How is it different with NIC processing in RX path ? It certainly violates the principle of least surprize: when one sees host pass packet to guest, one expects the packet to get into the networking stack, not get dropped by the driver internally. Guest stack can do with the packet what it sees fit. We actually wake up a thread if we can't fill up the queue, that will fill it up in GFP_KERNEL context. So I think we should find a way to pre-allocate if necessary and avoid error paths where allocating new memory is a required to avoid drops. Really, under ATOMIC context, there is no way you can avoid dropping packets if you cannot allocate memory. If you cannot allocate sk_buff (256 bytes !!), you wont be able to allocate the 1500+ bytes to hold the payload of next packets anyway. Same problem on a real NIC. Under memory pressure we _do_ packet drops. Nobody really complained. Sure, you can add yet another cache of pre-allocated skbs and pay the price of managing yet another cache layer, but still need to trop packets under stress. Pre-allocating skb on real NIC has a performance cost, because we clear sk_buff way ahead of time. By the time skb is finally received, cpu has to bring back into its cache memory cache lines. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
Great catch Jason. I agree this now raises the larger issue of how to handle a memory alloc failure in the middle of receive. As Eric mentioned, we can drop the packet and free the remaining (num_buf) frags. Michael, perhaps I'm missing something, but why would you prefer pre-allocating buffers in this case? If the guest kernel is OOM'ing, dropping packets should provide backpressure. Also, we could just as easily fail the initial skb alloc in page_to_skb, and I think that case also needs to be handled now in the same fashion as a memory allocation failure in receive_mergeable. Best, Mike ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Tue, Nov 19, 2013 at 01:36:36PM -0800, Eric Dumazet wrote: On Tue, 2013-11-19 at 22:49 +0200, Michael S. Tsirkin wrote: On Tue, Nov 19, 2013 at 06:03:48AM -0800, Eric Dumazet wrote: On Tue, 2013-11-19 at 16:05 +0800, Jason Wang wrote: We need to drop the refcnt of page when we fail to allocate an skb for frag list, otherwise it will be leaked. The bug was introduced by commit 2613af0ed18a11d5c566a81f9a6510b73180660a (virtio_net: migrate mergeable rx buffers to page frag allocators). Cc: Michael Dalton mwdal...@google.com Cc: Eric Dumazet eduma...@google.com Cc: Rusty Russell ru...@rustcorp.com.au Cc: Michael S. Tsirkin m...@redhat.com Signed-off-by: Jason Wang jasow...@redhat.com --- The patch was needed for 3.12 stable. Good catch, but if we return from receive_mergeable() in the 'middle' of the frags we would need for the current skb, who will call the virtqueue_get_buf() to flush the remaining frags ? Don't we also need to call virtqueue_get_buf() like while (--num_buf) { buf = virtqueue_get_buf(rq-vq, len); if (!buf) break; put_page(virt_to_head_page(buf)); } ? Let me explain what worries me in your suggestion: struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC); if (unlikely(!nskb)) { head_skb-dev-stats.rx_dropped++; return -ENOMEM; } is this the failure case we are talking about? I thought Jason patch was about this, no ? I think this is a symprom of a larger problem introduced by 2613af0ed18a11d5c566a81f9a6510b73180660a, namely that we now need to allocate memory in the middle of processing a packet. I think discarding a completely valid and well-formed packet from the receive queue because we are unable to allocate new memory with GFP_ATOMIC for future packets is not a good idea. How is it different with NIC processing in RX path ? Which NIC? Virtio? Prior to 2613af0ed18a11d5c566a81f9a6510b73180660a it didn't drop packets received from host as far as I can tell. virtio is more like a pipe than a real NIC in this respect. It certainly violates the principle of least surprize: when one sees host pass packet to guest, one expects the packet to get into the networking stack, not get dropped by the driver internally. Guest stack can do with the packet what it sees fit. We actually wake up a thread if we can't fill up the queue, that will fill it up in GFP_KERNEL context. So I think we should find a way to pre-allocate if necessary and avoid error paths where allocating new memory is a required to avoid drops. Really, under ATOMIC context, there is no way you can avoid dropping packets if you cannot allocate memory. If you cannot allocate sk_buff (256 bytes !!), you wont be able to allocate the 1500+ bytes to hold the payload of next packets anyway. that's why we do: if (!try_fill_recv(rq, GFP_ATOMIC)) schedule_delayed_work(vi-refill, 0); the queues are large enough for a single failure not to be an immediate problem. Same problem on a real NIC. Under memory pressure we _do_ packet drops. Nobody really complained. Sure, you can add yet another cache of pre-allocated skbs and pay the price of managing yet another cache layer, but still need to trop packets under stress. We don't need a cache even. Just enough to avoid dropping packets if allocation failed in the middle so we don't dequeue a buffer and then drop it. Once we use this reserved skb, we stop processing the queue until refill gives it back. Pre-allocating skb on real NIC has a performance cost, because we clear sk_buff way ahead of time. By the time skb is finally received, cpu has to bring back into its cache memory cache lines. Alternatively we can pre-allocate the memory but avoid clearing it maybe? -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On Tue, 2013-11-19 at 23:53 +0200, Michael S. Tsirkin wrote: Which NIC? Virtio? Prior to 2613af0ed18a11d5c566a81f9a6510b73180660a it didn't drop packets received from host as far as I can tell. virtio is more like a pipe than a real NIC in this respect. Prior/after to this patch, you were not posting buffers, so if packets were received on a physical NIC, you were dropping the packets anyway. It makes no difference at all, adding a cushion might make you feel better, but its really not worth it. Under memory stress, it makes better sense to drop a super big GRO packet (The one needing frag_list extension ...) It gives a better signal to the sender to reduce its pressure, and gives opportunity to free more of your memory. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
Hi, After further reflection I think we're looking at two related issues: (a) a memory leak that Jason has identified that occurs when a memory allocation fails in receive_mergeable. Jasons commit solves this issue. (b) virtio-net does not dequeue all buffers for a packet in the case that an error occurs on receive and mergeable receive buffers is enabled. For (a), this bug is new and due to changes in 2613af0ed18a, and the net impact is memory leak on the physical page. However, I believe (b) has always been possible in some form because if page_to_skb() returns NULL (e.g., due to SKB allocation failure), receive_mergeable is never called. AFAICT this is also the behavior prior to 2613af0ed18a. The net impact of (b) would be that virtio-net would interpret a packet buffer that is in the middle of a mergeable packet as the start of a new packet, which is definitely also a bug (and the buffer contents could contain bytes that resembled a valid virtio-net header). A solution for (b) will require handling both the page_to_skb memory allocation failures and the memory allocation failures in receive_mergeable introduced by 2613af0ed18a. Best, Mike ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On 11/19/2013 10:03 PM, Eric Dumazet wrote: On Tue, 2013-11-19 at 16:05 +0800, Jason Wang wrote: We need to drop the refcnt of page when we fail to allocate an skb for frag list, otherwise it will be leaked. The bug was introduced by commit 2613af0ed18a11d5c566a81f9a6510b73180660a (virtio_net: migrate mergeable rx buffers to page frag allocators). Cc: Michael Dalton mwdal...@google.com Cc: Eric Dumazet eduma...@google.com Cc: Rusty Russell ru...@rustcorp.com.au Cc: Michael S. Tsirkin m...@redhat.com Signed-off-by: Jason Wang jasow...@redhat.com --- The patch was needed for 3.12 stable. Good catch, but if we return from receive_mergeable() in the 'middle' of the frags we would need for the current skb, who will call the virtqueue_get_buf() to flush the remaining frags ? Don't we also need to call virtqueue_get_buf() like while (--num_buf) { buf = virtqueue_get_buf(rq-vq, len); if (!buf) break; put_page(virt_to_head_page(buf)); } ? Yes we need this, will send V2. Thanks ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On 11/20/2013 04:49 AM, Michael S. Tsirkin wrote: On Tue, Nov 19, 2013 at 06:03:48AM -0800, Eric Dumazet wrote: On Tue, 2013-11-19 at 16:05 +0800, Jason Wang wrote: We need to drop the refcnt of page when we fail to allocate an skb for frag list, otherwise it will be leaked. The bug was introduced by commit 2613af0ed18a11d5c566a81f9a6510b73180660a (virtio_net: migrate mergeable rx buffers to page frag allocators). Cc: Michael Dalton mwdal...@google.com Cc: Eric Dumazet eduma...@google.com Cc: Rusty Russell ru...@rustcorp.com.au Cc: Michael S. Tsirkin m...@redhat.com Signed-off-by: Jason Wang jasow...@redhat.com --- The patch was needed for 3.12 stable. Good catch, but if we return from receive_mergeable() in the 'middle' of the frags we would need for the current skb, who will call the virtqueue_get_buf() to flush the remaining frags ? Don't we also need to call virtqueue_get_buf() like while (--num_buf) { buf = virtqueue_get_buf(rq-vq, len); if (!buf) break; put_page(virt_to_head_page(buf)); } ? Let me explain what worries me in your suggestion: struct sk_buff *nskb = alloc_skb(0, GFP_ATOMIC); if (unlikely(!nskb)) { head_skb-dev-stats.rx_dropped++; return -ENOMEM; } is this the failure case we are talking about? I think this is a symprom of a larger problem introduced by 2613af0ed18a11d5c566a81f9a6510b73180660a, namely that we now need to allocate memory in the middle of processing a packet. I think discarding a completely valid and well-formed packet from the receive queue because we are unable to allocate new memory with GFP_ATOMIC for future packets is not a good idea. It certainly violates the principle of least surprize: when one sees host pass packet to guest, one expects the packet to get into the networking stack, not get dropped by the driver internally. Guest stack can do with the packet what it sees fit. We actually wake up a thread if we can't fill up the queue, that will fill it up in GFP_KERNEL context. So I think we should find a way to pre-allocate if necessary and avoid error paths where allocating new memory is a required to avoid drops. The problem happens only on memory pressure, this pre-allocation may add more stress on this. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH net] virtio-net: fix page refcnt leaking when fail to allocate frag skb
On 11/20/2013 09:34 AM, Michael Dalton wrote: Hi, After further reflection I think we're looking at two related issues: (a) a memory leak that Jason has identified that occurs when a memory allocation fails in receive_mergeable. Jasons commit solves this issue. (b) virtio-net does not dequeue all buffers for a packet in the case that an error occurs on receive and mergeable receive buffers is enabled. For (a), this bug is new and due to changes in 2613af0ed18a, and the net impact is memory leak on the physical page. However, I believe (b) has always been possible in some form because if page_to_skb() returns NULL (e.g., due to SKB allocation failure), receive_mergeable is never called. AFAICT this is also the behavior prior to 2613af0ed18a. The net impact of (b) would be that virtio-net would interpret a packet buffer that is in the middle of a mergeable packet as the start of a new packet, which is definitely also a bug (and the buffer contents could contain bytes that resembled a valid virtio-net header). A solution for (b) will require handling both the page_to_skb memory allocation failures and the memory allocation failures in receive_mergeable introduced by 2613af0ed18a. Ture, so we first need a patch to solve page_to_skb() failure which could be used for stable tree prior to 2613af0ed18a. Then another patch to solve the issue introduced by 2613af0ed18a which could be only used for 3.12 stable. Will draft patches for them. Thanks Best, Mike ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization