Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On Mon, Jan 23, 2017 at 02:12:47PM -0800, John Fastabend wrote:
> On 17-01-23 12:09 PM, Michael S. Tsirkin wrote:
> > On Mon, Jan 23, 2017 at 09:22:36PM +0200, Michael S. Tsirkin wrote:
> >> On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
> >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >>> index 62dbf4b..3b129b4 100644
> >>> --- a/drivers/net/virtio_net.c
> >>> +++ b/drivers/net/virtio_net.c
> >>> @@ -41,6 +41,9 @@
> >>>  #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
> >>>  #define GOOD_COPY_LEN 128
> >>>
> >>> +/* Amount of XDP headroom to prepend to packets for use by xdp_adjust_head */
> >>> +#define VIRTIO_XDP_HEADROOM 256
> >>> +
> >>>  /* RX packet size EWMA. The average packet size is used to determine the packet
> >>>   * buffer size when refilling RX rings. As the entire RX ring may be refilled
> >>>   * at once, the weight is chosen so that the EWMA will be insensitive to short-
> >>
> >> I wonder where does this number come from? This is quite a lot and
> >> means that using XDP_PASS will slow down any sockets on top of it.
> >> Which in turn means people will try to remove XDP when not in use,
> >> causing resets. E.g. build_skb (which I have a patch to switch to) uses
> >> a much more reasonable NET_SKB_PAD.
>
> I just used the value Alexei (or someone?) came up with. I think it needs to be
> large enough to avoid copy in header encap cases. So minimum
>
>   VXLAN_HDR + OUTER_UDP + OUTER_IPV6_HDR + OUTER_MAC =
>   8 + 8 + 40 + 14 = 70
>
> The choice of VXLAN hdr was sort of arbitrary but seems good for estimates. For
> what its worth there is also a ndo_set_rx_headroom could we use that to set it
> and choose a reasonable default.
>
> >>
> >> --
> >> MST
> >
> >
> > Let me show you a patch that I've been cooking. What is missing there
> > is handling corner cases like e.g. when ring size is ~4 entries so
> > using smaller buffers might mean we no longer have enough space to store
> > a full packet. So it looks like I have to maintain the skb copy path
> > for this hardware.
> >
> > With this patch, standard configuration has NET_SKB_PAD + NET_IP_ALIGN
> > bytes head padding. Would this be enough for XDP? If yes we do not
> > need the resets.
>
> Based on above seems a bit small (L1_CACHE_BYTES + 2)? How tricky would it
> be to add support for ndo_set_rx_headroom.

Donnu but then what? Expose it to userspace and let admin
make the decision for us?

> >
> > Thoughts?
>
> I'll take a look at the patch this afternoon. Thanks.
>
> >
> > --->
> >
> > virtio_net: switch to build_skb for mrg_rxbuf
> >
> > For small packets data copy was observed to
> > take up about 15% CPU time. Switch to build_skb
> > and avoid the copy when using mergeable rx buffers.
> >
> > As a bonus, medium-size skbs that fit in a page will be
> > completely linear.
> >
> > Of course, we now need to lower the lower bound on packet size,
> > to make sure a sane number of skbs fits in rx socket buffer.
> > By how much? I don't know yet.
> >
> > It might also be useful to prefetch the packet buffer since
> > net stack will likely use it soon.
> >
> > Lightly tested, in particular, I didn't yet test what this
> > actually does to performance - sending this out for early
> > feedback/flames.
> >
> > TODO: it appears that Linux won't handle correctly the case of first
> > buffer being very small (or consisting exclusively of virtio header).
> > This is already the case for current code, need to fix.
> > TODO: might be unfair to the last packet in a fragment as we include
> > remaining space if any in its truesize.
> >
> > Signed-off-by: Michael S. Tsirkin
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index b425fa1..a6b996f 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -38,6 +38,8 @@ module_param(gso, bool, 0444);
> >
> >  /* FIXME: MTU in config. */
> >  #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
> > +//#define MIN_PACKET_ALLOC GOOD_PACKET_LEN
> > +#define MIN_PACKET_ALLOC 128
> >  #define GOOD_COPY_LEN 128
> >
> >  /* RX packet size EWMA. The average packet size is used to determine the packet
> > @@ -246,6 +248,9 @@ static void *mergeable_ctx_to_buf_address(unsigned long mrg_ctx)
> >  static unsigned long mergeable_buf_to_ctx(void *buf, unsigned int truesize)
> >  {
> >  	unsigned int size = truesize / MERGEABLE_BUFFER_ALIGN;
> > +
> > +	BUG_ON((unsigned long)buf & (MERGEABLE_BUFFER_ALIGN - 1));
> > +	BUG_ON(size - 1 >= MERGEABLE_BUFFER_ALIGN);
> >  	return (unsigned long)buf | (size - 1);
> >  }
> >
> > @@ -354,25 +359,54 @@ static struct sk_buff *receive_big(struct net_device *dev,
> >  	return NULL;
> >  }
> >
> > +#define VNET_SKB_PAD (NET_SKB_PAD + NET_IP_ALIGN)
> > +#define VNET_SKB_BUG (VNET_SKB_PAD < sizeof(struct virtio_ne
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On 17-01-23 12:09 PM, Michael S. Tsirkin wrote:
> On Mon, Jan 23, 2017 at 09:22:36PM +0200, Michael S. Tsirkin wrote:
>> On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index 62dbf4b..3b129b4 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -41,6 +41,9 @@
>>>  #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
>>>  #define GOOD_COPY_LEN 128
>>>
>>> +/* Amount of XDP headroom to prepend to packets for use by xdp_adjust_head */
>>> +#define VIRTIO_XDP_HEADROOM 256
>>> +
>>>  /* RX packet size EWMA. The average packet size is used to determine the packet
>>>   * buffer size when refilling RX rings. As the entire RX ring may be refilled
>>>   * at once, the weight is chosen so that the EWMA will be insensitive to short-
>>
>> I wonder where does this number come from? This is quite a lot and
>> means that using XDP_PASS will slow down any sockets on top of it.
>> Which in turn means people will try to remove XDP when not in use,
>> causing resets. E.g. build_skb (which I have a patch to switch to) uses
>> a much more reasonable NET_SKB_PAD.

I just used the value Alexei (or someone?) came up with. I think it needs to be
large enough to avoid copy in header encap cases. So minimum

  VXLAN_HDR + OUTER_UDP + OUTER_IPV6_HDR + OUTER_MAC =
  8 + 8 + 40 + 14 = 70

The choice of VXLAN hdr was sort of arbitrary but seems good for estimates. For
what its worth there is also a ndo_set_rx_headroom could we use that to set it
and choose a reasonable default.

>>
>> --
>> MST
>
>
> Let me show you a patch that I've been cooking. What is missing there
> is handling corner cases like e.g. when ring size is ~4 entries so
> using smaller buffers might mean we no longer have enough space to store
> a full packet. So it looks like I have to maintain the skb copy path
> for this hardware.
>
> With this patch, standard configuration has NET_SKB_PAD + NET_IP_ALIGN
> bytes head padding. Would this be enough for XDP? If yes we do not
> need the resets.

Based on above seems a bit small (L1_CACHE_BYTES + 2)? How tricky would it
be to add support for ndo_set_rx_headroom.

>
> Thoughts?

I'll take a look at the patch this afternoon. Thanks.

>
> --->
>
> virtio_net: switch to build_skb for mrg_rxbuf
>
> For small packets data copy was observed to
> take up about 15% CPU time. Switch to build_skb
> and avoid the copy when using mergeable rx buffers.
>
> As a bonus, medium-size skbs that fit in a page will be
> completely linear.
>
> Of course, we now need to lower the lower bound on packet size,
> to make sure a sane number of skbs fits in rx socket buffer.
> By how much? I don't know yet.
>
> It might also be useful to prefetch the packet buffer since
> net stack will likely use it soon.
>
> Lightly tested, in particular, I didn't yet test what this
> actually does to performance - sending this out for early
> feedback/flames.
>
> TODO: it appears that Linux won't handle correctly the case of first
> buffer being very small (or consisting exclusively of virtio header).
> This is already the case for current code, need to fix.
> TODO: might be unfair to the last packet in a fragment as we include
> remaining space if any in its truesize.
>
> Signed-off-by: Michael S. Tsirkin
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index b425fa1..a6b996f 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -38,6 +38,8 @@ module_param(gso, bool, 0444);
>
>  /* FIXME: MTU in config. */
>  #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
> +//#define MIN_PACKET_ALLOC GOOD_PACKET_LEN
> +#define MIN_PACKET_ALLOC 128
>  #define GOOD_COPY_LEN 128
>
>  /* RX packet size EWMA. The average packet size is used to determine the packet
> @@ -246,6 +248,9 @@ static void *mergeable_ctx_to_buf_address(unsigned long mrg_ctx)
>  static unsigned long mergeable_buf_to_ctx(void *buf, unsigned int truesize)
>  {
>  	unsigned int size = truesize / MERGEABLE_BUFFER_ALIGN;
> +
> +	BUG_ON((unsigned long)buf & (MERGEABLE_BUFFER_ALIGN - 1));
> +	BUG_ON(size - 1 >= MERGEABLE_BUFFER_ALIGN);
>  	return (unsigned long)buf | (size - 1);
>  }
>
> @@ -354,25 +359,54 @@ static struct sk_buff *receive_big(struct net_device *dev,
>  	return NULL;
>  }
>
> +#define VNET_SKB_PAD (NET_SKB_PAD + NET_IP_ALIGN)
> +#define VNET_SKB_BUG (VNET_SKB_PAD < sizeof(struct virtio_net_hdr_mrg_rxbuf))
> +#define VNET_SKB_LEN(len) ((len) - sizeof(struct virtio_net_hdr_mrg_rxbuf))
> +#define VNET_SKB_OFF VNET_SKB_LEN(VNET_SKB_PAD)
> +
> +static struct sk_buff *vnet_build_skb(struct virtnet_info *vi,
> +				      void *buf,
> +				      unsigned int len, unsigned int truesize)
> +{
> +	struct sk_buff *skb = build_skb(buf, t
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On Mon, Jan 23, 2017 at 09:22:36PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 62dbf4b..3b129b4 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -41,6 +41,9 @@
> >  #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
> >  #define GOOD_COPY_LEN 128
> >
> > +/* Amount of XDP headroom to prepend to packets for use by xdp_adjust_head */
> > +#define VIRTIO_XDP_HEADROOM 256
> > +
> >  /* RX packet size EWMA. The average packet size is used to determine the packet
> >   * buffer size when refilling RX rings. As the entire RX ring may be refilled
> >   * at once, the weight is chosen so that the EWMA will be insensitive to short-
>
> I wonder where does this number come from? This is quite a lot and
> means that using XDP_PASS will slow down any sockets on top of it.
> Which in turn means people will try to remove XDP when not in use,
> causing resets. E.g. build_skb (which I have a patch to switch to) uses
> a much more reasonable NET_SKB_PAD.
>
> --
> MST

Let me show you a patch that I've been cooking. What is missing there
is handling corner cases like e.g. when ring size is ~4 entries so
using smaller buffers might mean we no longer have enough space to store
a full packet. So it looks like I have to maintain the skb copy path
for this hardware.

With this patch, standard configuration has NET_SKB_PAD + NET_IP_ALIGN
bytes head padding. Would this be enough for XDP? If yes we do not
need the resets.

Thoughts?

--->

virtio_net: switch to build_skb for mrg_rxbuf

For small packets data copy was observed to
take up about 15% CPU time. Switch to build_skb
and avoid the copy when using mergeable rx buffers.

As a bonus, medium-size skbs that fit in a page will be
completely linear.

Of course, we now need to lower the lower bound on packet size,
to make sure a sane number of skbs fits in rx socket buffer.
By how much? I don't know yet.

It might also be useful to prefetch the packet buffer since
net stack will likely use it soon.

Lightly tested, in particular, I didn't yet test what this
actually does to performance - sending this out for early
feedback/flames.

TODO: it appears that Linux won't handle correctly the case of first
buffer being very small (or consisting exclusively of virtio header).
This is already the case for current code, need to fix.
TODO: might be unfair to the last packet in a fragment as we include
remaining space if any in its truesize.

Signed-off-by: Michael S. Tsirkin

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b425fa1..a6b996f 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -38,6 +38,8 @@ module_param(gso, bool, 0444);

 /* FIXME: MTU in config. */
 #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
+//#define MIN_PACKET_ALLOC GOOD_PACKET_LEN
+#define MIN_PACKET_ALLOC 128
 #define GOOD_COPY_LEN 128

 /* RX packet size EWMA. The average packet size is used to determine the packet
@@ -246,6 +248,9 @@ static void *mergeable_ctx_to_buf_address(unsigned long mrg_ctx)
 static unsigned long mergeable_buf_to_ctx(void *buf, unsigned int truesize)
 {
 	unsigned int size = truesize / MERGEABLE_BUFFER_ALIGN;
+
+	BUG_ON((unsigned long)buf & (MERGEABLE_BUFFER_ALIGN - 1));
+	BUG_ON(size - 1 >= MERGEABLE_BUFFER_ALIGN);
 	return (unsigned long)buf | (size - 1);
 }

@@ -354,25 +359,54 @@ static struct sk_buff *receive_big(struct net_device *dev,
 	return NULL;
 }

+#define VNET_SKB_PAD (NET_SKB_PAD + NET_IP_ALIGN)
+#define VNET_SKB_BUG (VNET_SKB_PAD < sizeof(struct virtio_net_hdr_mrg_rxbuf))
+#define VNET_SKB_LEN(len) ((len) - sizeof(struct virtio_net_hdr_mrg_rxbuf))
+#define VNET_SKB_OFF VNET_SKB_LEN(VNET_SKB_PAD)
+
+static struct sk_buff *vnet_build_skb(struct virtnet_info *vi,
+				      void *buf,
+				      unsigned int len, unsigned int truesize)
+{
+	struct sk_buff *skb = build_skb(buf, truesize);
+
+	if (!skb)
+		return NULL;
+
+	skb_reserve(skb, VNET_SKB_PAD);
+	skb_put(skb, VNET_SKB_LEN(len));
+
+	return skb;
+}
+
 static struct sk_buff *receive_mergeable(struct net_device *dev,
 					 struct virtnet_info *vi,
 					 struct receive_queue *rq,
 					 unsigned long ctx,
-					 unsigned int len)
+					 unsigned int len,
+					 struct virtio_net_hdr_mrg_rxbuf *hdr)
 {
 	void *buf = mergeable_ctx_to_buf_address(ctx);
-	struct virtio_net_hdr_mrg_rxbuf *hdr = buf;
-	u16 num_buf = virtio16_to_cpu(vi->vdev, hdr->num_buffers);
+	u16 num_buf;
 	struct page *page
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 62dbf4b..3b129b4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -41,6 +41,9 @@
>  #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
>  #define GOOD_COPY_LEN 128
>
> +/* Amount of XDP headroom to prepend to packets for use by xdp_adjust_head */
> +#define VIRTIO_XDP_HEADROOM 256
> +
>  /* RX packet size EWMA. The average packet size is used to determine the packet
>   * buffer size when refilling RX rings. As the entire RX ring may be refilled
>   * at once, the weight is chosen so that the EWMA will be insensitive to short-

I wonder where does this number come from? This is quite a lot and
means that using XDP_PASS will slow down any sockets on top of it.
Which in turn means people will try to remove XDP when not in use,
causing resets. E.g. build_skb (which I have a patch to switch to) uses
a much more reasonable NET_SKB_PAD.

--
MST
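
[Editor's sketch] The reply to this question elsewhere in the thread estimates the minimum headroom as VXLAN + outer UDP + outer IPv6 + outer MAC = 70 bytes. The user-space snippet below only redoes that arithmetic and compares it with the two candidates under discussion. The ASSUMED_* values stand in for NET_SKB_PAD and NET_IP_ALIGN and are illustrative only (in the kernel they are architecture dependent); the helper names are hypothetical and not part of virtio_net.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Header sizes in bytes, as listed in the thread's estimate. */
enum {
	VXLAN_HDR_LEN      = 8,
	OUTER_UDP_HDR_LEN  = 8,
	OUTER_IPV6_HDR_LEN = 40,
	OUTER_MAC_HDR_LEN  = 14,
};

/* Value proposed by the patch under review. */
#define VIRTIO_XDP_HEADROOM 256

/* ASSUMED values, for illustration only: the thread describes the
 * build_skb padding as roughly L1_CACHE_BYTES + 2. */
#define ASSUMED_L1_CACHE_BYTES 64
#define ASSUMED_NET_SKB_PAD    ASSUMED_L1_CACHE_BYTES
#define ASSUMED_NET_IP_ALIGN   2

/* Bytes an XDP program would need to push a VXLAN-over-IPv6 encap. */
static size_t vxlan_ipv6_encap_len(void)
{
	return VXLAN_HDR_LEN + OUTER_UDP_HDR_LEN +
	       OUTER_IPV6_HDR_LEN + OUTER_MAC_HDR_LEN;
}

/* True if `headroom` bytes are enough to hold that encap without a copy. */
static bool headroom_fits_encap(size_t headroom)
{
	/* Sanity check on the estimate itself. */
	assert(vxlan_ipv6_encap_len() > 0);
	return headroom >= vxlan_ipv6_encap_len();
}
```

Under these assumed values the 256-byte headroom covers the estimate with room to spare, while NET_SKB_PAD + NET_IP_ALIGN falls a few bytes short, which matches the "seems a bit small" remark in the reply.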
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On Sat, Jan 21, 2017 at 08:14:19PM -0800, John Fastabend wrote:
> On 17-01-21 06:51 PM, Jason Wang wrote:
> >
> >
> > On 2017-01-21 01:48, Michael S. Tsirkin wrote:
> >> On Fri, Jan 20, 2017 at 04:59:11PM +, David Laight wrote:
> >>> From: Michael S. Tsirkin
> >>>> Sent: 19 January 2017 21:12
> >>>>> On 2017-01-18 23:15, Michael S. Tsirkin wrote:
> >>>>>> On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
> >>>>>>> Add support for XDP adjust head by allocating a 256B header region
> >>>>>>> that XDP programs can grow into. This is only enabled when a XDP
> >>>>>>> program is loaded.
> >>>>>>>
> >>>>>>> In order to ensure that we do not have to unwind queue headroom push
> >>>>>>> queue setup below bpf_prog_add. It reads better to do a prog ref
> >>>>>>> unwind vs another queue setup call.
> >>>>>>>
> >>>>>>> At the moment this code must do a full reset to ensure old buffers
> >>>>>>> without headroom on program add or with headroom on program removal
> >>>>>>> are not used incorrectly in the datapath. Ideally we would only
> >>>>>>> have to disable/enable the RX queues being updated but there is no
> >>>>>>> API to do this at the moment in virtio so use the big hammer. In
> >>>>>>> practice it is likely not that big of a problem as this will only
> >>>>>>> happen when XDP is enabled/disabled changing programs does not
> >>>>>>> require the reset. There is some risk that the driver may either
> >>>>>>> have an allocation failure or for some reason fail to correctly
> >>>>>>> negotiate with the underlying backend in this case the driver will
> >>>>>>> be left uninitialized. I have not seen this ever happen on my test
> >>>>>>> systems and for what its worth this same failure case can occur
> >>>>>>> from probe and other contexts in virtio framework.
> >>>>>>>
> >>>>>>> Signed-off-by: John Fastabend
> >>>>>> I've been thinking about it - can't we drop
> >>>>>> old buffers without the head room which were posted before
> >>>>>> xdp attached?
> >>>>>>
> >>>>>> Avoiding the reset would be much nicer.
> >>>>>>
> >>>>>> Thoughts?
> >>>>>>
> >>>>> As been discussed before, device may use them in the same time so it's not
> >>>>> safe. Or do you mean detect them after xdp were set and drop the buffer
> >>>>> without head room, this looks sub-optimal.
> >>>>>
> >>>>> Thanks
> >>>> Yes, this is what I mean. Why is this suboptimal? It's a single branch
> >>>> in code. Yes we might lose some packets but the big hammer of device
> >>>> reset will likely lose more.
> >>> Why not leave let the hardware receive into the 'small' buffer (without
> >>> headroom) and do a copy when a frame is received.
> >>> Replace the buffers with 'big' ones for the next receive.
> >>> A data copy on a ring full of buffers won't really be noticed.
> >>>
> >>> David
> >>>
> >> I like that. John?
> >>
> >
> > This works, I prefer this only if it uses simpler code (but I suspect) than
> > reset.
> >
> > Thanks
>
> Before the reset path I looked at doing this but it seems to require tracking
> if a buffer had headroom on a per buffer basis. I don't see a good spot to
> put a bit like this? It could go in the inbuf 'ctx' added by virtqueue_add_inbuf
> but I would need to change the current usage of ctx which in the mergeable case
> at least is just a simple pointer today. I don't like this because it
> complicates the normal path and the XDP hotpath.
>
> Otherwise we could somehow mark the ring at the point where XDP is enabled so
> that it can learn when a full iteration around the ring. But I can't see a
> simple way to make this work either.
>
> I don't know the reset look straight forward to me and although not ideal is
> fairly common on hardware based drivers during configuration changes. I'm open
> to any ideas on where to put the metadata to track headroom though.
>
> Thanks,
> John

Well with 4K pages we actually have 4 spare bits to use.
In fact this means we could reduce the mergeable buffer alignment.
It starts getting strange with 64K pages where we are out of space,
and I just noticed that with bigger pages virtio is actually broken.

So let me fix it up first of all, and on top - maybe we can just
increase the alignment for 64k pages and up? Truesize alignment to 512
is still reasonable and presumably these 64k page boxes have lots of
memory. Would it make sense to tweak SK_RMEM_MAX up for larger page
sizes?

--
MST
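
[Editor's sketch] The "4 spare bits" remark can be illustrated with a user-space model of the ctx encoding that the quoted mergeable_buf_to_ctx() hunk uses: the buffer address is aligned, so its low bits are free to carry (truesize / align - 1). The 256-byte alignment below is an assumption (the thread does not state the value of MERGEABLE_BUFFER_ALIGN), and the headroom flag and helper names are hypothetical, not existing driver code. With 256-byte alignment there are 8 free low bits; a 4K page needs 4 of them for the size, leaving 4 spare, while a 64K page would need all 8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED alignment: the low 8 bits of an aligned address are free. */
#define MERGEABLE_BUFFER_ALIGN 256u

/* Bits needed for (truesize / align - 1) when truesize <= 4096. */
#define CTX_SIZE_BITS     4u
#define CTX_SIZE_MASK     ((1u << CTX_SIZE_BITS) - 1)

/* HYPOTHETICAL flag: first spare bit marks "buffer was posted with headroom". */
#define CTX_HEADROOM_FLAG (1u << CTX_SIZE_BITS)

/* Pack address, truesize and the hypothetical headroom flag into one word. */
static uintptr_t buf_to_ctx(void *buf, unsigned int truesize, bool has_headroom)
{
	unsigned int size = truesize / MERGEABLE_BUFFER_ALIGN;

	/* Same checks as the BUG_ON()s in the quoted patch, narrowed to 4 bits. */
	assert(((uintptr_t)buf & (MERGEABLE_BUFFER_ALIGN - 1)) == 0);
	assert(size >= 1 && size - 1 <= CTX_SIZE_MASK);

	return (uintptr_t)buf | (size - 1) |
	       (has_headroom ? CTX_HEADROOM_FLAG : 0);
}

/* Recover the buffer address by clearing all low bits. */
static void *ctx_to_buf(uintptr_t ctx)
{
	return (void *)(ctx & ~(uintptr_t)(MERGEABLE_BUFFER_ALIGN - 1));
}

/* Recover truesize from the low size bits. */
static unsigned int ctx_to_truesize(uintptr_t ctx)
{
	return ((unsigned int)(ctx & CTX_SIZE_MASK) + 1) * MERGEABLE_BUFFER_ALIGN;
}

/* Read back the hypothetical per-buffer headroom marker. */
static bool ctx_has_headroom(uintptr_t ctx)
{
	return (ctx & CTX_HEADROOM_FLAG) != 0;
}
```

A receive path could then branch on ctx_has_headroom() to decide whether a buffer posted before XDP was attached needs a copy or a drop; whether that extra branch is acceptable in the hot path is exactly the concern raised in the quoted message.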
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On 17-01-21 06:51 PM, Jason Wang wrote:
>
>
> On 2017-01-21 01:48, Michael S. Tsirkin wrote:
>> On Fri, Jan 20, 2017 at 04:59:11PM +, David Laight wrote:
>>> From: Michael S. Tsirkin
>>>> Sent: 19 January 2017 21:12
>>>>> On 2017-01-18 23:15, Michael S. Tsirkin wrote:
>>>>>> On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
>>>>>>> Add support for XDP adjust head by allocating a 256B header region
>>>>>>> that XDP programs can grow into. This is only enabled when a XDP
>>>>>>> program is loaded.
>>>>>>>
>>>>>>> In order to ensure that we do not have to unwind queue headroom push
>>>>>>> queue setup below bpf_prog_add. It reads better to do a prog ref
>>>>>>> unwind vs another queue setup call.
>>>>>>>
>>>>>>> At the moment this code must do a full reset to ensure old buffers
>>>>>>> without headroom on program add or with headroom on program removal
>>>>>>> are not used incorrectly in the datapath. Ideally we would only
>>>>>>> have to disable/enable the RX queues being updated but there is no
>>>>>>> API to do this at the moment in virtio so use the big hammer. In
>>>>>>> practice it is likely not that big of a problem as this will only
>>>>>>> happen when XDP is enabled/disabled changing programs does not
>>>>>>> require the reset. There is some risk that the driver may either
>>>>>>> have an allocation failure or for some reason fail to correctly
>>>>>>> negotiate with the underlying backend in this case the driver will
>>>>>>> be left uninitialized. I have not seen this ever happen on my test
>>>>>>> systems and for what its worth this same failure case can occur
>>>>>>> from probe and other contexts in virtio framework.
>>>>>>>
>>>>>>> Signed-off-by: John Fastabend
>>>>>> I've been thinking about it - can't we drop
>>>>>> old buffers without the head room which were posted before
>>>>>> xdp attached?
>>>>>>
>>>>>> Avoiding the reset would be much nicer.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>> As been discussed before, device may use them in the same time so it's not
>>>>> safe. Or do you mean detect them after xdp were set and drop the buffer
>>>>> without head room, this looks sub-optimal.
>>>>>
>>>>> Thanks
>>>> Yes, this is what I mean. Why is this suboptimal? It's a single branch
>>>> in code. Yes we might lose some packets but the big hammer of device
>>>> reset will likely lose more.
>>> Why not leave let the hardware receive into the 'small' buffer (without
>>> headroom) and do a copy when a frame is received.
>>> Replace the buffers with 'big' ones for the next receive.
>>> A data copy on a ring full of buffers won't really be noticed.
>>>
>>> David
>>>
>> I like that. John?
>>
>
> This works, I prefer this only if it uses simpler code (but I suspect) than
> reset.
>
> Thanks

Before the reset path I looked at doing this but it seems to require tracking
if a buffer had headroom on a per buffer basis. I don't see a good spot to
put a bit like this? It could go in the inbuf 'ctx' added by virtqueue_add_inbuf
but I would need to change the current usage of ctx which in the mergeable case
at least is just a simple pointer today. I don't like this because it
complicates the normal path and the XDP hotpath.

Otherwise we could somehow mark the ring at the point where XDP is enabled so
that it can learn when a full iteration around the ring. But I can't see a
simple way to make this work either.

I don't know the reset look straight forward to me and although not ideal is
fairly common on hardware based drivers during configuration changes. I'm open
to any ideas on where to put the metadata to track headroom though.

Thanks,
John
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On 2017-01-21 01:48, Michael S. Tsirkin wrote:
> On Fri, Jan 20, 2017 at 04:59:11PM +, David Laight wrote:
>> From: Michael S. Tsirkin
>>> Sent: 19 January 2017 21:12
>>>> On 2017-01-18 23:15, Michael S. Tsirkin wrote:
>>>>> On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
>>>>>> Add support for XDP adjust head by allocating a 256B header region
>>>>>> that XDP programs can grow into. This is only enabled when a XDP
>>>>>> program is loaded.
>>>>>>
>>>>>> In order to ensure that we do not have to unwind queue headroom push
>>>>>> queue setup below bpf_prog_add. It reads better to do a prog ref
>>>>>> unwind vs another queue setup call.
>>>>>>
>>>>>> At the moment this code must do a full reset to ensure old buffers
>>>>>> without headroom on program add or with headroom on program removal
>>>>>> are not used incorrectly in the datapath. Ideally we would only
>>>>>> have to disable/enable the RX queues being updated but there is no
>>>>>> API to do this at the moment in virtio so use the big hammer. In
>>>>>> practice it is likely not that big of a problem as this will only
>>>>>> happen when XDP is enabled/disabled changing programs does not
>>>>>> require the reset. There is some risk that the driver may either
>>>>>> have an allocation failure or for some reason fail to correctly
>>>>>> negotiate with the underlying backend in this case the driver will
>>>>>> be left uninitialized. I have not seen this ever happen on my test
>>>>>> systems and for what its worth this same failure case can occur
>>>>>> from probe and other contexts in virtio framework.
>>>>>>
>>>>>> Signed-off-by: John Fastabend
>>>>> I've been thinking about it - can't we drop
>>>>> old buffers without the head room which were posted before
>>>>> xdp attached?
>>>>>
>>>>> Avoiding the reset would be much nicer.
>>>>>
>>>>> Thoughts?
>>>>>
>>>> As been discussed before, device may use them in the same time so it's not
>>>> safe. Or do you mean detect them after xdp were set and drop the buffer
>>>> without head room, this looks sub-optimal.
>>>>
>>>> Thanks
>>> Yes, this is what I mean. Why is this suboptimal? It's a single branch
>>> in code. Yes we might lose some packets but the big hammer of device
>>> reset will likely lose more.
>> Why not leave let the hardware receive into the 'small' buffer (without
>> headroom) and do a copy when a frame is received.
>> Replace the buffers with 'big' ones for the next receive.
>> A data copy on a ring full of buffers won't really be noticed.
>>
>> David
>>
> I like that. John?
>

This works, I prefer this only if it uses simpler code (but I suspect) than
reset.

Thanks
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On Fri, Jan 20, 2017 at 04:59:11PM +, David Laight wrote:
> From: Michael S. Tsirkin
> > Sent: 19 January 2017 21:12
> > > On 2017-01-18 23:15, Michael S. Tsirkin wrote:
> > > > On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
> > > > > Add support for XDP adjust head by allocating a 256B header region
> > > > > that XDP programs can grow into. This is only enabled when a XDP
> > > > > program is loaded.
> > > > >
> > > > > In order to ensure that we do not have to unwind queue headroom push
> > > > > queue setup below bpf_prog_add. It reads better to do a prog ref
> > > > > unwind vs another queue setup call.
> > > > >
> > > > > At the moment this code must do a full reset to ensure old buffers
> > > > > without headroom on program add or with headroom on program removal
> > > > > are not used incorrectly in the datapath. Ideally we would only
> > > > > have to disable/enable the RX queues being updated but there is no
> > > > > API to do this at the moment in virtio so use the big hammer. In
> > > > > practice it is likely not that big of a problem as this will only
> > > > > happen when XDP is enabled/disabled changing programs does not
> > > > > require the reset. There is some risk that the driver may either
> > > > > have an allocation failure or for some reason fail to correctly
> > > > > negotiate with the underlying backend in this case the driver will
> > > > > be left uninitialized. I have not seen this ever happen on my test
> > > > > systems and for what its worth this same failure case can occur
> > > > > from probe and other contexts in virtio framework.
> > > > >
> > > > > Signed-off-by: John Fastabend
> > > > I've been thinking about it - can't we drop
> > > > old buffers without the head room which were posted before
> > > > xdp attached?
> > > >
> > > > Avoiding the reset would be much nicer.
> > > >
> > > > Thoughts?
> > > >
> > >
> > > As been discussed before, device may use them in the same time so it's not
> > > safe. Or do you mean detect them after xdp were set and drop the buffer
> > > without head room, this looks sub-optimal.
> > >
> > > Thanks
> >
> > Yes, this is what I mean. Why is this suboptimal? It's a single branch
> > in code. Yes we might lose some packets but the big hammer of device
> > reset will likely lose more.
>
> Why not leave let the hardware receive into the 'small' buffer (without
> headroom) and do a copy when a frame is received.
> Replace the buffers with 'big' ones for the next receive.
> A data copy on a ring full of buffers won't really be noticed.
>
> David
>

I like that. John?

--
MST
RE: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
From: Michael S. Tsirkin
> Sent: 19 January 2017 21:12
> > On 2017-01-18 23:15, Michael S. Tsirkin wrote:
> > > On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
> > > > Add support for XDP adjust head by allocating a 256B header region
> > > > that XDP programs can grow into. This is only enabled when a XDP
> > > > program is loaded.
> > > >
> > > > In order to ensure that we do not have to unwind queue headroom push
> > > > queue setup below bpf_prog_add. It reads better to do a prog ref
> > > > unwind vs another queue setup call.
> > > >
> > > > At the moment this code must do a full reset to ensure old buffers
> > > > without headroom on program add or with headroom on program removal
> > > > are not used incorrectly in the datapath. Ideally we would only
> > > > have to disable/enable the RX queues being updated but there is no
> > > > API to do this at the moment in virtio so use the big hammer. In
> > > > practice it is likely not that big of a problem as this will only
> > > > happen when XDP is enabled/disabled changing programs does not
> > > > require the reset. There is some risk that the driver may either
> > > > have an allocation failure or for some reason fail to correctly
> > > > negotiate with the underlying backend in this case the driver will
> > > > be left uninitialized. I have not seen this ever happen on my test
> > > > systems and for what its worth this same failure case can occur
> > > > from probe and other contexts in virtio framework.
> > > >
> > > > Signed-off-by: John Fastabend
> > > I've been thinking about it - can't we drop
> > > old buffers without the head room which were posted before
> > > xdp attached?
> > >
> > > Avoiding the reset would be much nicer.
> > >
> > > Thoughts?
> > >
> >
> > As been discussed before, device may use them in the same time so it's not
> > safe. Or do you mean detect them after xdp were set and drop the buffer
> > without head room, this looks sub-optimal.
> >
> > Thanks
>
> Yes, this is what I mean. Why is this suboptimal? It's a single branch
> in code. Yes we might lose some packets but the big hammer of device
> reset will likely lose more.

Why not leave let the hardware receive into the 'small' buffer (without
headroom) and do a copy when a frame is received.
Replace the buffers with 'big' ones for the next receive.
A data copy on a ring full of buffers won't really be noticed.

	David
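
[Editor's sketch] The copy idea in this message can be modelled in user space as follows: a frame that arrived in an old buffer without headroom is copied into a replacement buffer that reserves headroom in front of the data. This is only an illustration of the suggestion under assumed names; copy_with_headroom() is hypothetical and is not a virtio_net or kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `len` bytes of frame data from `src` (a buffer posted without
 * headroom) into `dst`, leaving `headroom` free bytes in front of it.
 * Returns a pointer to the start of the frame inside `dst`, or NULL
 * if headroom plus frame does not fit in `dst_size` bytes.
 */
static unsigned char *copy_with_headroom(unsigned char *dst, size_t dst_size,
					 size_t headroom,
					 const unsigned char *src, size_t len)
{
	assert(dst != NULL && src != NULL);

	/* Written this way to avoid overflow in headroom + len. */
	if (headroom > dst_size || len > dst_size - headroom)
		return NULL;

	memcpy(dst + headroom, src, len);
	return dst + headroom;
}
```

The copy is only needed for buffers that were already posted when the XDP program was attached; once the ring has been refilled with buffers that carry headroom, the path would no longer be taken.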
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On 17-01-19 07:26 PM, Jason Wang wrote:
>
>
> On 2017-01-20 05:11, Michael S. Tsirkin wrote:
>> On Thu, Jan 19, 2017 at 11:05:40AM +0800, Jason Wang wrote:
>>>
>>> On 2017-01-18 23:15, Michael S. Tsirkin wrote:
>>>> On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
>>>>> Add support for XDP adjust head by allocating a 256B header region
>>>>> that XDP programs can grow into. This is only enabled when a XDP
>>>>> program is loaded.
>>>>>
>>>>> In order to ensure that we do not have to unwind queue headroom push
>>>>> queue setup below bpf_prog_add. It reads better to do a prog ref
>>>>> unwind vs another queue setup call.
>>>>>
>>>>> At the moment this code must do a full reset to ensure old buffers
>>>>> without headroom on program add or with headroom on program removal
>>>>> are not used incorrectly in the datapath. Ideally we would only
>>>>> have to disable/enable the RX queues being updated but there is no
>>>>> API to do this at the moment in virtio so use the big hammer. In
>>>>> practice it is likely not that big of a problem as this will only
>>>>> happen when XDP is enabled/disabled changing programs does not
>>>>> require the reset. There is some risk that the driver may either
>>>>> have an allocation failure or for some reason fail to correctly
>>>>> negotiate with the underlying backend in this case the driver will
>>>>> be left uninitialized. I have not seen this ever happen on my test
>>>>> systems and for what its worth this same failure case can occur
>>>>> from probe and other contexts in virtio framework.
>>>>>
>>>>> Signed-off-by: John Fastabend
>>>> I've been thinking about it - can't we drop
>>>> old buffers without the head room which were posted before
>>>> xdp attached?
>>>>
>>>> Avoiding the reset would be much nicer.
>>>>
>>>> Thoughts?
>>>>
>>> As been discussed before, device may use them in the same time so it's not
>>> safe. Or do you mean detect them after xdp were set and drop the buffer
>>> without head room, this looks sub-optimal.
>>>
>>> Thanks
>> Yes, this is what I mean. Why is this suboptimal? It's a single branch
>> in code. Yes we might lose some packets but the big hammer of device
>> reset will likely lose more.
>>
>
> Maybe I was wrong but I think driver should try their best to avoid dropping
> packets. (And look at mlx4, it did something similar to this patch).
>
> Thanks

+1 sorry didn't see your reply as I was typing mine. Bottom line when XDP
returns I believe the driver must be ready to accept packets or managing
XDP will be problematic.

.John
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On 17-01-19 01:11 PM, Michael S. Tsirkin wrote:
> On Thu, Jan 19, 2017 at 11:05:40AM +0800, Jason Wang wrote:
>>
>>
>> On 2017-01-18 23:15, Michael S. Tsirkin wrote:
>>> On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
>>>> Add support for XDP adjust head by allocating a 256B header region
>>>> that XDP programs can grow into. This is only enabled when a XDP
>>>> program is loaded.
>>>>
>>>> In order to ensure that we do not have to unwind queue headroom push
>>>> queue setup below bpf_prog_add. It reads better to do a prog ref
>>>> unwind vs another queue setup call.
>>>>
>>>> At the moment this code must do a full reset to ensure old buffers
>>>> without headroom on program add or with headroom on program removal
>>>> are not used incorrectly in the datapath. Ideally we would only
>>>> have to disable/enable the RX queues being updated but there is no
>>>> API to do this at the moment in virtio so use the big hammer. In
>>>> practice it is likely not that big of a problem as this will only
>>>> happen when XDP is enabled/disabled changing programs does not
>>>> require the reset. There is some risk that the driver may either
>>>> have an allocation failure or for some reason fail to correctly
>>>> negotiate with the underlying backend in this case the driver will
>>>> be left uninitialized. I have not seen this ever happen on my test
>>>> systems and for what its worth this same failure case can occur
>>>> from probe and other contexts in virtio framework.
>>>>
>>>> Signed-off-by: John Fastabend
>>> I've been thinking about it - can't we drop
>>> old buffers without the head room which were posted before
>>> xdp attached?
>>>
>>> Avoiding the reset would be much nicer.
>>>
>>> Thoughts?
>>>
>>
>> As been discussed before, device may use them in the same time so it's not
>> safe. Or do you mean detect them after xdp were set and drop the buffer
>> without head room, this looks sub-optimal.
>>
>> Thanks
>
> Yes, this is what I mean. Why is this suboptimal? It's a single branch
> in code. Yes we might lose some packets but the big hammer of device
> reset will likely lose more.
>

Maybe I'm not following. Is the suggestion to drop packets, after XDP is
set up, for all outstanding buffers until we have reallocated all of the
buffers? In that case we can't just detach the buffers; we have to wait
until the backend retires them by using them, correct?

But when the XDP setup call returns, we need to guarantee that the buffers
and the driver are set up. Otherwise the next n packets get dropped in the
future. If there is no traffic currently, this could be at some undetermined
point in the future. This will be very buggy. Did I miss something?

Thanks,
John
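To make the concern above concrete, here is a minimal sketch of a receive ring that still holds buffers posted before the XDP program was attached. Every name in it (mock_ring, mock_receive, MOCK_RING_SIZE) is hypothetical and models nothing in virtio_net.c directly. It only assumes that a stale buffer causes a drop and is then reposted with headroom, so the number of drops depends on how many stale buffers are outstanding, not on when the traffic arrives.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_RING_SIZE 4	/* hypothetical ring size */

/* One posted receive buffer. has_headroom is false for buffers that
 * were posted before the (hypothetical) XDP attach. */
struct mock_buf {
	bool has_headroom;
};

struct mock_ring {
	struct mock_buf bufs[MOCK_RING_SIZE];
	size_t next;	/* slot the device fills next */
};

/* Ring state right after XDP attach without a reset: all buffers stale. */
static void mock_ring_init_stale(struct mock_ring *r)
{
	for (size_t i = 0; i < MOCK_RING_SIZE; i++)
		r->bufs[i].has_headroom = false;
	r->next = 0;
}

/* Receive one packet with XDP attached. Returns true if the packet could
 * be handed to the program, false if it had to be dropped because the
 * buffer it landed in was stale. The slot is reposted with headroom. */
static bool mock_receive(struct mock_ring *r)
{
	assert(r->next < MOCK_RING_SIZE);

	struct mock_buf *b = &r->bufs[r->next];
	bool ok = b->has_headroom;

	b->has_headroom = true;
	r->next = (r->next + 1) % MOCK_RING_SIZE;
	return ok;
}

/* Count how many of the next npackets are dropped. */
static size_t mock_count_drops(struct mock_ring *r, size_t npackets)
{
	size_t drops = 0;

	for (size_t i = 0; i < npackets; i++)
		if (!mock_receive(r))
			drops++;
	return drops;
}
```

In this model, two packets arriving right after attach are both dropped, and the remaining two stale slots still cost two drops whenever the next burst arrives, however much later that is.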
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On 2017-01-20 05:11, Michael S. Tsirkin wrote:
> On Thu, Jan 19, 2017 at 11:05:40AM +0800, Jason Wang wrote:
>> On 2017-01-18 23:15, Michael S. Tsirkin wrote:
>>> On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
>>>> Add support for XDP adjust head by allocating a 256B header region
>>>> that XDP programs can grow into. This is only enabled when a XDP
>>>> program is loaded.
>>>>
>>>> In order to ensure that we do not have to unwind queue headroom push
>>>> queue setup below bpf_prog_add. It reads better to do a prog ref
>>>> unwind vs another queue setup call.
>>>>
>>>> At the moment this code must do a full reset to ensure old buffers
>>>> without headroom on program add or with headroom on program removal
>>>> are not used incorrectly in the datapath. Ideally we would only
>>>> have to disable/enable the RX queues being updated but there is no
>>>> API to do this at the moment in virtio so use the big hammer. In
>>>> practice it is likely not that big of a problem as this will only
>>>> happen when XDP is enabled/disabled changing programs does not
>>>> require the reset. There is some risk that the driver may either
>>>> have an allocation failure or for some reason fail to correctly
>>>> negotiate with the underlying backend in this case the driver will
>>>> be left uninitialized. I have not seen this ever happen on my test
>>>> systems and for what its worth this same failure case can occur
>>>> from probe and other contexts in virtio framework.
>>>>
>>>> Signed-off-by: John Fastabend
>>> I've been thinking about it - can't we drop
>>> old buffers without the head room which were posted before
>>> xdp attached?
>>>
>>> Avoiding the reset would be much nicer.
>>>
>>> Thoughts?
>> As been discussed before, device may use them in the same time so it's not
>> safe. Or do you mean detect them after xdp were set and drop the buffer
>> without head room, this looks sub-optimal.
>>
>> Thanks
> Yes, this is what I mean. Why is this suboptimal? It's a single branch
> in code. Yes we might lose some packets but the big hammer of device
> reset will likely lose more.

Maybe I'm wrong, but I think drivers should try their best to avoid
dropping packets. (Look at mlx4: it did something similar to this patch.)

Thanks
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On Thu, Jan 19, 2017 at 11:05:40AM +0800, Jason Wang wrote:
>
>
> On 2017-01-18 23:15, Michael S. Tsirkin wrote:
> > On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
> > > Add support for XDP adjust head by allocating a 256B header region
> > > that XDP programs can grow into. This is only enabled when a XDP
> > > program is loaded.
> > >
> > > In order to ensure that we do not have to unwind queue headroom push
> > > queue setup below bpf_prog_add. It reads better to do a prog ref
> > > unwind vs another queue setup call.
> > >
> > > At the moment this code must do a full reset to ensure old buffers
> > > without headroom on program add or with headroom on program removal
> > > are not used incorrectly in the datapath. Ideally we would only
> > > have to disable/enable the RX queues being updated but there is no
> > > API to do this at the moment in virtio so use the big hammer. In
> > > practice it is likely not that big of a problem as this will only
> > > happen when XDP is enabled/disabled changing programs does not
> > > require the reset. There is some risk that the driver may either
> > > have an allocation failure or for some reason fail to correctly
> > > negotiate with the underlying backend in this case the driver will
> > > be left uninitialized. I have not seen this ever happen on my test
> > > systems and for what its worth this same failure case can occur
> > > from probe and other contexts in virtio framework.
> > >
> > > Signed-off-by: John Fastabend
> > I've been thinking about it - can't we drop
> > old buffers without the head room which were posted before
> > xdp attached?
> >
> > Avoiding the reset would be much nicer.
> >
> > Thoughts?
> >
>
> As been discussed before, device may use them in the same time so it's not
> safe. Or do you mean detect them after xdp were set and drop the buffer
> without head room, this looks sub-optimal.
>
> Thanks

Yes, this is what I mean. Why is this suboptimal? It's a single branch
in code. Yes we might lose some packets but the big hammer of device
reset will likely lose more.

--
MST
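For readers following along, the "single branch" could look roughly like the sketch below. It is only an illustration under stated assumptions: it assumes the driver can record, per posted buffer, how much headroom that buffer was allocated with. The names mock_rx_ctx and mock_rx_classify are hypothetical, and MOCK_XDP_HEADROOM merely mirrors the value of VIRTIO_XDP_HEADROOM from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_XDP_HEADROOM 256u	/* same value as VIRTIO_XDP_HEADROOM in the patch */

/* What the receive path decides to do with one completed buffer. */
enum mock_rx_action {
	MOCK_RX_NO_XDP,		/* no program attached, normal skb path */
	MOCK_RX_RUN_XDP,	/* buffer has headroom, run the program */
	MOCK_RX_DROP		/* stale buffer posted before attach */
};

/* Hypothetical per-buffer context remembered when the buffer was posted. */
struct mock_rx_ctx {
	unsigned int headroom;	/* bytes reserved in front of the packet */
};

static enum mock_rx_action mock_rx_classify(const struct mock_rx_ctx *ctx,
					    bool xdp_attached)
{
	assert(ctx);

	if (!xdp_attached)
		return MOCK_RX_NO_XDP;

	/* The single extra branch: a buffer without enough headroom
	 * cannot be handed to a program that may call adjust_head. */
	if (ctx->headroom < MOCK_XDP_HEADROOM)
		return MOCK_RX_DROP;

	return MOCK_RX_RUN_XDP;
}
```

The sketch covers only the attach direction. On program removal, buffers that still carry headroom would need their own handling, which is not modelled here.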
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On 2017-01-18 23:15, Michael S. Tsirkin wrote:
> On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
>> Add support for XDP adjust head by allocating a 256B header region
>> that XDP programs can grow into. This is only enabled when a XDP
>> program is loaded.
>>
>> In order to ensure that we do not have to unwind queue headroom push
>> queue setup below bpf_prog_add. It reads better to do a prog ref
>> unwind vs another queue setup call.
>>
>> At the moment this code must do a full reset to ensure old buffers
>> without headroom on program add or with headroom on program removal
>> are not used incorrectly in the datapath. Ideally we would only
>> have to disable/enable the RX queues being updated but there is no
>> API to do this at the moment in virtio so use the big hammer. In
>> practice it is likely not that big of a problem as this will only
>> happen when XDP is enabled/disabled changing programs does not
>> require the reset. There is some risk that the driver may either
>> have an allocation failure or for some reason fail to correctly
>> negotiate with the underlying backend in this case the driver will
>> be left uninitialized. I have not seen this ever happen on my test
>> systems and for what its worth this same failure case can occur
>> from probe and other contexts in virtio framework.
>>
>> Signed-off-by: John Fastabend
> I've been thinking about it - can't we drop
> old buffers without the head room which were posted before
> xdp attached?
>
> Avoiding the reset would be much nicer.
>
> Thoughts?

As discussed before, the device may be using them at the same time, so
it's not safe. Or do you mean detecting them after XDP is set and dropping
the buffers without headroom? That looks sub-optimal.

Thanks
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On Tue, Jan 17, 2017 at 02:22:59PM -0800, John Fastabend wrote:
> Add support for XDP adjust head by allocating a 256B header region
> that XDP programs can grow into. This is only enabled when a XDP
> program is loaded.
>
> In order to ensure that we do not have to unwind queue headroom push
> queue setup below bpf_prog_add. It reads better to do a prog ref
> unwind vs another queue setup call.
>
> At the moment this code must do a full reset to ensure old buffers
> without headroom on program add or with headroom on program removal
> are not used incorrectly in the datapath. Ideally we would only
> have to disable/enable the RX queues being updated but there is no
> API to do this at the moment in virtio so use the big hammer. In
> practice it is likely not that big of a problem as this will only
> happen when XDP is enabled/disabled changing programs does not
> require the reset. There is some risk that the driver may either
> have an allocation failure or for some reason fail to correctly
> negotiate with the underlying backend in this case the driver will
> be left uninitialized. I have not seen this ever happen on my test
> systems and for what its worth this same failure case can occur
> from probe and other contexts in virtio framework.
>
> Signed-off-by: John Fastabend

I've been thinking about it - can't we drop
old buffers without the head room which were posted before
xdp attached?

Avoiding the reset would be much nicer.

Thoughts?

> ---
>  drivers/net/virtio_net.c | 149 +++---
>  1 file changed, 125 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 62dbf4b..3b129b4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -41,6 +41,9 @@
>  #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
>  #define GOOD_COPY_LEN	128
>
> +/* Amount of XDP headroom to prepend to packets for use by xdp_adjust_head */
> +#define VIRTIO_XDP_HEADROOM 256
> +
>  /* RX packet size EWMA. The average packet size is used to determine the packet
>   * buffer size when refilling RX rings. As the entire RX ring may be refilled
>   * at once, the weight is chosen so that the EWMA will be insensitive to short-
> @@ -359,6 +362,7 @@ static void virtnet_xdp_xmit(struct virtnet_info *vi,
>  	}
>
>  	if (vi->mergeable_rx_bufs) {
> +		xdp->data -= sizeof(struct virtio_net_hdr_mrg_rxbuf);
>  		/* Zero header and leave csum up to XDP layers */
>  		hdr = xdp->data;
>  		memset(hdr, 0, vi->hdr_len);
> @@ -375,7 +379,9 @@ static void virtnet_xdp_xmit(struct virtnet_info *vi,
>  		num_sg = 2;
>  		sg_init_table(sq->sg, 2);
>  		sg_set_buf(sq->sg, hdr, vi->hdr_len);
> -		skb_to_sgvec(skb, sq->sg + 1, 0, skb->len);
> +		skb_to_sgvec(skb, sq->sg + 1,
> +			     xdp->data - xdp->data_hard_start,
> +			     xdp->data_end - xdp->data);
>  	}
>  	err = virtqueue_add_outbuf(sq->vq, sq->sg, num_sg,
>  				   data, GFP_ATOMIC);
> @@ -401,7 +407,6 @@ static struct sk_buff *receive_small(struct net_device *dev,
>  	struct bpf_prog *xdp_prog;
>
>  	len -= vi->hdr_len;
> -	skb_trim(skb, len);
>
>  	rcu_read_lock();
>  	xdp_prog = rcu_dereference(rq->xdp_prog);
> @@ -413,11 +418,15 @@ static struct sk_buff *receive_small(struct net_device *dev,
>  		if (unlikely(hdr->hdr.gso_type || hdr->hdr.flags))
>  			goto err_xdp;
>
> -		xdp.data = skb->data;
> +		xdp.data_hard_start = skb->data;
> +		xdp.data = skb->data + VIRTIO_XDP_HEADROOM;
>  		xdp.data_end = xdp.data + len;
>  		act = bpf_prog_run_xdp(xdp_prog, &xdp);
>  		switch (act) {
>  		case XDP_PASS:
> +			/* Recalculate length in case bpf program changed it */
> +			__skb_pull(skb, xdp.data - xdp.data_hard_start);
> +			len = xdp.data_end - xdp.data;
>  			break;
>  		case XDP_TX:
>  			virtnet_xdp_xmit(vi, rq, &xdp, skb);
> @@ -432,6 +441,7 @@ static struct sk_buff *receive_small(struct net_device *dev,
>  	}
>  	rcu_read_unlock();
>
> +	skb_trim(skb, len);
>  	return skb;
>
>  err_xdp:
> @@ -480,7 +490,7 @@ static struct page *xdp_linearize_page(struct receive_queue *rq,
>  				       unsigned int *len)
>  {
>  	struct page *page = alloc_page(GFP_ATOMIC);
> -	unsigned int page_off = 0;
> +	unsigned int page_off = VIRTIO_XDP_HEADROOM;
>
>  	if (!page)
>  		return NULL;
> @@ -516,7 +526,8 @@ static struct page *xdp_linearize_page(struct receive_queue *rq,
>  			put_page(p);
>  	}
>
> -	*len = page_off;
> +	/* Headroom does not con
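The pointer bookkeeping in the receive_small() hunk quoted above can be tried out in user space with the sketch below. It is not kernel code: mock_xdp_buff only copies the three fields the hunk touches, and mock_adjust_head() is a hypothetical stand-in for the adjust_head helper whose bounds check is an assumption made for illustration. The two accessors mirror the expressions the XDP_PASS case passes to __skb_pull() and assigns to len.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_XDP_HEADROOM 256	/* same value as VIRTIO_XDP_HEADROOM in the patch */
#define MOCK_ETH_HLEN 14	/* Ethernet header length */

/* User-space copy of the three pointers used by the hunk. */
struct mock_xdp_buff {
	unsigned char *data_hard_start;	/* start of the buffer, headroom included */
	unsigned char *data;		/* current start of packet data */
	unsigned char *data_end;	/* end of packet data */
};

/* Mirrors the setup in the hunk: packet data begins after the headroom. */
static void mock_xdp_init(struct mock_xdp_buff *xdp, unsigned char *buf,
			  size_t len)
{
	assert(buf != NULL);
	xdp->data_hard_start = buf;
	xdp->data = buf + MOCK_XDP_HEADROOM;
	xdp->data_end = xdp->data + len;
}

/* Hypothetical adjust_head: a negative delta grows the packet into the
 * headroom, a positive delta strips bytes from the front. Returns 0 on
 * success and -1 if the new start would leave the buffer or leave less
 * than an Ethernet header of data. */
static int mock_adjust_head(struct mock_xdp_buff *xdp, int delta)
{
	ptrdiff_t off = (xdp->data - xdp->data_hard_start) + delta;
	ptrdiff_t max = (xdp->data_end - xdp->data_hard_start) - MOCK_ETH_HLEN;

	if (off < 0 || off > max)
		return -1;
	xdp->data = xdp->data_hard_start + off;
	return 0;
}

/* Byte count the XDP_PASS case would pull from the front of the skb. */
static size_t mock_pull_len(const struct mock_xdp_buff *xdp)
{
	return (size_t)(xdp->data - xdp->data_hard_start);
}

/* Packet length recalculated after the program ran. */
static size_t mock_pkt_len(const struct mock_xdp_buff *xdp)
{
	return (size_t)(xdp->data_end - xdp->data);
}
```

With a 100-byte packet, growing the head by 70 bytes leaves 186 bytes to pull and a 170-byte packet. A request for more than the remaining headroom is rejected and leaves the pointers unchanged.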
Re: [net PATCH v5 6/6] virtio_net: XDP support for adjust_head
On 2017-01-18 06:22, John Fastabend wrote:
> +static int virtnet_reset(struct virtnet_info *vi)
> +{
> +	struct virtio_device *dev = vi->vdev;
> +	int ret;
> +
> +	virtio_config_disable(dev);
> +	dev->failed = dev->config->get_status(dev) & VIRTIO_CONFIG_S_FAILED;
> +	virtnet_freeze_down(dev);
> +	_remove_vq_common(vi);
> +
> +	dev->config->reset(dev);
> +	virtio_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE);
> +	virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER);
> +
> +	ret = virtio_finalize_features(dev);
> +	if (ret)
> +		goto err;
> +
> +	ret = virtnet_restore_up(dev);
> +	if (ret)
> +		goto err;
> +	ret = _virtnet_set_queues(vi, vi->curr_queue_pairs);
> +	if (ret)
> +		goto err;
> +
> +	virtio_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK);
> +	virtio_config_enable(dev);
> +	return 0;
> +err:
> +	virtio_add_status(dev, VIRTIO_CONFIG_S_FAILED);
> +	return ret;
> +}
> +

Hi John:

I would still prefer not to open-code (part of) virtio_device_freeze() and
virtio_device_restore() here. How about:

1) introduce __virtio_device_freeze/__virtio_device_restore, which accept
   a function pointer for freeze/restore
2) for virtio_device_freeze/virtio_device_restore, just pass
   drv->freeze/drv->restore (locked version)
3) for virtnet_reset(), pass the unlocked version of freeze and restore

Just my preference; if both Michael and you stick to this, I'm also fine.

Thanks
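The shape of the three steps proposed above can be sketched in plain C as follows. This is a user-space mock, not the virtio core: struct mock_vdev, mock_device_freeze_common() and the locked/unlocked callbacks are hypothetical names, and the "lock" is reduced to a counter so the difference between the two call paths is observable. Only the freeze half is shown; restore would follow the same pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Mock device state; stands in for struct virtio_device. */
struct mock_vdev {
	int frozen;	/* 1 once the freeze sequence completed */
	int lock_taken;	/* how many times the locked callback ran */
};

typedef int (*mock_freeze_fn)(struct mock_vdev *dev);

/* Step 1: shared helper that runs the common freeze sequence and calls
 * whichever driver callback the caller passes in. */
static int mock_device_freeze_common(struct mock_vdev *dev, mock_freeze_fn fn)
{
	int ret = 0;

	assert(dev != NULL);
	/* Common steps (config disable, status bookkeeping) would go here. */
	if (fn)
		ret = fn(dev);
	if (!ret)
		dev->frozen = 1;
	return ret;
}

/* Driver callback for callers that already hold the lock. */
static int mock_freeze_unlocked(struct mock_vdev *dev)
{
	(void)dev;
	return 0;
}

/* Driver callback that takes the lock itself, then does the same work. */
static int mock_freeze_locked(struct mock_vdev *dev)
{
	dev->lock_taken++;	/* stands in for lock/unlock around the work */
	return mock_freeze_unlocked(dev);
}

/* Step 2: the regular freeze entry point passes the locked callback. */
static int mock_device_freeze(struct mock_vdev *dev)
{
	return mock_device_freeze_common(dev, mock_freeze_locked);
}

/* Step 3: the reset path passes the unlocked callback. */
static int mock_reset_freeze(struct mock_vdev *dev)
{
	return mock_device_freeze_common(dev, mock_freeze_unlocked);
}
```

The point of the pattern is that the common sequence lives in one place, and the reset path differs from the regular freeze path only in the callback it supplies.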