Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
From: "Michael S. Tsirkin" Date: Tue, 20 Dec 2011 13:15:12 +0200 > On Wed, Dec 07, 2011 at 01:52:35PM -0500, David Miller wrote: >> Once you sort this out, reply with an Acked-by: for me, thanks. > > Acked-by: Michael S. Tsirkin Applied. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On Wed, Dec 07, 2011 at 01:52:35PM -0500, David Miller wrote:
> From: "Michael S. Tsirkin"
> Date: Wed, 7 Dec 2011 18:10:02 +0200
>
>> On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
>>> From: Krishna Kumar2
>>> Date: Fri, 25 Nov 2011 09:39:11 +0530
>>>
>>>> Jason Wang wrote on 11/25/2011 08:51:57 AM:
>>>>>
>>>>> My description is not clear again :(
>>>>> I mean the same vhost thread:
>>>>>
>>>>> vhost thread #0 transmits packets of flow A on processor M
>>>>> ...
>>>>> vhost thread #0 moves to another processor N and starts to
>>>>> transmit packets of flow A
>>>>
>>>> Thanks for clarifying. Yes, binding vhosts to CPUs
>>>> makes the incoming packet go to the same vhost each
>>>> time. BTW, are you doing any binding and/or irqbalance
>>>> when you run your tests? I am not running either at
>>>> this time, but thought both might be useful.
>>>
>>> So are we going with this patch, or are we saying that vhost binding
>>> is a requirement?
>>
>> OK, we didn't come to a conclusion, so I would be inclined
>> to merge this patch as-is for 3.2 and revisit later.
>> One question though: do these changes affect userspace
>> in any way? For example, will this commit us to
>> ensuring that a single flow gets a unique hash even
>> for strange configurations that transmit the same flow
>> from multiple CPUs?
>
> Once you sort this out, reply with an Acked-by: for me, thanks.

Acked-by: Michael S. Tsirkin
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On 12/08/2011 12:10 AM, Michael S. Tsirkin wrote:
> On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
>> From: Krishna Kumar2
>> Date: Fri, 25 Nov 2011 09:39:11 +0530
>>
>>> Jason Wang wrote on 11/25/2011 08:51:57 AM:
>>>>
>>>> My description is not clear again :(
>>>> I mean the same vhost thread:
>>>>
>>>> vhost thread #0 transmits packets of flow A on processor M
>>>> ...
>>>> vhost thread #0 moves to another processor N and starts to
>>>> transmit packets of flow A
>>>
>>> Thanks for clarifying. Yes, binding vhosts to CPUs
>>> makes the incoming packet go to the same vhost each
>>> time. BTW, are you doing any binding and/or irqbalance
>>> when you run your tests? I am not running either at
>>> this time, but thought both might be useful.
>>
>> So are we going with this patch, or are we saying that vhost binding
>> is a requirement?
>
> OK, we didn't come to a conclusion, so I would be inclined
> to merge this patch as-is for 3.2 and revisit later.
> One question though: do these changes affect userspace
> in any way? For example, will this commit us to
> ensuring that a single flow gets a unique hash even
> for strange configurations that transmit the same flow
> from multiple CPUs?

The hash is generated by either the host kernel or the host nic, so I
think it is unique, unless some nic provides different hashes for the
same flow. I wonder whether such a nic exists.
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
From: "Michael S. Tsirkin" Date: Wed, 7 Dec 2011 18:10:02 +0200 > On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote: >> From: Krishna Kumar2 >> Date: Fri, 25 Nov 2011 09:39:11 +0530 >> >> > Jason Wang wrote on 11/25/2011 08:51:57 AM: >> >> >> >> My description is not clear again :( >> >> I mean the same vhost thead: >> >> >> >> vhost thread #0 transmits packets of flow A on processor M >> >> ... >> >> vhost thread #0 move to another process N and start to transmit packets >> >> of flow A >> > >> > Thanks for clarifying. Yes, binding vhosts to CPU's >> > makes the incoming packet go to the same vhost each >> > time. BTW, are you doing any binding and/or irqbalance >> > when you run your tests? I am not running either at >> > this time, but thought both might be useful. >> >> So are we going with this patch or are we saying that vhost binding >> is a requirement? > > OK we didn't come to a conclusion so I would be inclined > to merge this patch as is for 3.2, and revisit later. > One question though: do these changes affect userspace > in any way? For example, will this commit us to > ensure that a single flow gets a unique hash even > for strange configurations that transmit the same flow > from multiple cpus? Once you sort this out, reply with an Acked-by: for me, thanks. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
> From: Krishna Kumar2
> Date: Fri, 25 Nov 2011 09:39:11 +0530
>
>> Jason Wang wrote on 11/25/2011 08:51:57 AM:
>>>
>>> My description is not clear again :(
>>> I mean the same vhost thread:
>>>
>>> vhost thread #0 transmits packets of flow A on processor M
>>> ...
>>> vhost thread #0 moves to another processor N and starts to
>>> transmit packets of flow A
>>
>> Thanks for clarifying. Yes, binding vhosts to CPUs
>> makes the incoming packet go to the same vhost each
>> time. BTW, are you doing any binding and/or irqbalance
>> when you run your tests? I am not running either at
>> this time, but thought both might be useful.
>
> So are we going with this patch, or are we saying that vhost binding
> is a requirement?

OK, we didn't come to a conclusion, so I would be inclined
to merge this patch as-is for 3.2 and revisit later.
One question though: do these changes affect userspace
in any way? For example, will this commit us to
ensuring that a single flow gets a unique hash even
for strange configurations that transmit the same flow
from multiple CPUs?

-- 
MST
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On Mon, 28 Nov 2011 12:25:15 +0800 Jason Wang wrote:
> I'm using ixgbe for testing too. For the host, its driver seems to
> provide an irq affinity hint, so no binding or irqbalance is needed.

The hint is there for irqbalance to use. You still need to either set
the affinity manually or run irqbalance.
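For reference, a minimal sketch of what "manual affinity" means in
practice: pinning one queue's interrupt to a fixed CPU by writing a
hex bitmask to procfs. The IRQ number below is a placeholder; the real
ones for each ixgbe queue are listed in /proc/interrupts. irqbalance
does essentially the same writes, just with its own placement policy.

#include <stdio.h>

int main(void)
{
	/* Placeholder IRQ number: look up the ixgbe-TxRx-N lines in
	 * /proc/interrupts for the real values on your system. */
	FILE *f = fopen("/proc/irq/42/smp_affinity", "w");

	if (!f) {
		perror("fopen");
		return 1;
	}
	/* Hex CPU bitmask: 0x1 pins the interrupt to CPU 0. */
	fprintf(f, "%x\n", 1u << 0);
	return fclose(f) ? 1 : 0;
}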
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On 11/28/2011 01:23 AM, Michael S. Tsirkin wrote:
> On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
>> From: Krishna Kumar2
>> Date: Fri, 25 Nov 2011 09:39:11 +0530
>>
>>> Jason Wang wrote on 11/25/2011 08:51:57 AM:
>>>>
>>>> My description is not clear again :(
>>>> I mean the same vhost thread:
>>>>
>>>> vhost thread #0 transmits packets of flow A on processor M
>>>> ...
>>>> vhost thread #0 moves to another processor N and starts to
>>>> transmit packets of flow A
>>>
>>> Thanks for clarifying. Yes, binding vhosts to CPUs
>>> makes the incoming packet go to the same vhost each
>>> time. BTW, are you doing any binding and/or irqbalance
>>> when you run your tests? I am not running either at
>>> this time, but thought both might be useful.
>>
>> So are we going with this patch, or are we saying that vhost binding
>> is a requirement?
>
> I think it's a good idea to make sure we understand the problem root
> cause well before applying the patch. We still have a bit of time
> before 3.2. In particular, why does the vhost thread bounce between
> CPUs so much?

Other than this: since we cannot assume the behavior of the underlying
nic, using the rxhash to identify a flow is the more generic way.

> Long term it seems the best way is to expose the preferred mapping
> from the guest and forward it to the device.

I was working on this and hope to post it soon.
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On 11/25/2011 12:09 PM, Krishna Kumar2 wrote:
> Jason Wang wrote on 11/25/2011 08:51:57 AM:
>>
>> My description is not clear again :(
>> I mean the same vhost thread:
>>
>> vhost thread #0 transmits packets of flow A on processor M
>> ...
>> vhost thread #0 moves to another processor N and starts to
>> transmit packets of flow A
>
> Thanks for clarifying. Yes, binding vhosts to CPUs
> makes the incoming packet go to the same vhost each
> time. BTW, are you doing any binding and/or irqbalance
> when you run your tests? I am not running either at
> this time, but thought both might be useful.

I'm using ixgbe for testing too. For the host, its driver seems to
provide an irq affinity hint, so no binding or irqbalance is needed.
For the guest, irqbalance is used. I've also tried binding irqs in the
guest, and it can improve rx performance.

> - KK
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On Fri, Nov 25, 2011 at 01:35:52AM -0500, David Miller wrote:
> From: Krishna Kumar2
> Date: Fri, 25 Nov 2011 09:39:11 +0530
>
>> Jason Wang wrote on 11/25/2011 08:51:57 AM:
>>>
>>> My description is not clear again :(
>>> I mean the same vhost thread:
>>>
>>> vhost thread #0 transmits packets of flow A on processor M
>>> ...
>>> vhost thread #0 moves to another processor N and starts to
>>> transmit packets of flow A
>>
>> Thanks for clarifying. Yes, binding vhosts to CPUs
>> makes the incoming packet go to the same vhost each
>> time. BTW, are you doing any binding and/or irqbalance
>> when you run your tests? I am not running either at
>> this time, but thought both might be useful.
>
> So are we going with this patch, or are we saying that vhost binding
> is a requirement?

I think it's a good idea to make sure we understand the problem root
cause well before applying the patch. We still have a bit of time
before 3.2. In particular, why does the vhost thread bounce between
CPUs so much?

Long term it seems the best way is to expose the preferred mapping
from the guest and forward it to the device.

-- 
MST
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On Fri, Nov 25, 2011 at 09:39:11AM +0530, Krishna Kumar2 wrote:
> Jason Wang wrote on 11/25/2011 08:51:57 AM:
>>
>> My description is not clear again :(
>> I mean the same vhost thread:
>>
>> vhost thread #0 transmits packets of flow A on processor M
>> ...
>> vhost thread #0 moves to another processor N and starts to
>> transmit packets of flow A
>
> Thanks for clarifying. Yes, binding vhosts to CPUs
> makes the incoming packet go to the same vhost each
> time.

Interesting, but still not sure why. What if you bind the VCPUs but
not the vhost thread?

> BTW, are you doing any binding and/or irqbalance
> when you run your tests? I am not running either at
> this time, but thought both might be useful.
>
> - KK

Either pinning or irqbalance is a good idea. Doing neither means a
random CPU ends up handling the interrupts.

-- 
MST
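As an aside, pinning a vhost or VCPU thread from userspace is a
one-liner with taskset (taskset -pc <cpu> <tid>), or the equivalent
sched_setaffinity(2) call if scripted; a minimal sketch follows, where
the TID and CPU arguments are whatever you pick, e.g. the vhost-<pid>
thread id found with ps -eLf.

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
	cpu_set_t set;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <tid> <cpu>\n", argv[0]);
		return 1;
	}

	CPU_ZERO(&set);
	CPU_SET(atoi(argv[2]), &set);	/* one CPU in the mask */

	/* Works on kernel threads such as vhost-<pid> too. */
	if (sched_setaffinity(atoi(argv[1]), sizeof(set), &set)) {
		perror("sched_setaffinity");
		return 1;
	}
	return 0;
}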
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
From: Krishna Kumar2
Date: Fri, 25 Nov 2011 09:39:11 +0530

> Jason Wang wrote on 11/25/2011 08:51:57 AM:
>>
>> My description is not clear again :(
>> I mean the same vhost thread:
>>
>> vhost thread #0 transmits packets of flow A on processor M
>> ...
>> vhost thread #0 moves to another processor N and starts to
>> transmit packets of flow A
>
> Thanks for clarifying. Yes, binding vhosts to CPUs
> makes the incoming packet go to the same vhost each
> time. BTW, are you doing any binding and/or irqbalance
> when you run your tests? I am not running either at
> this time, but thought both might be useful.

So are we going with this patch, or are we saying that vhost binding
is a requirement?
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
Jason Wang wrote on 11/25/2011 08:51:57 AM:
>
> My description is not clear again :(
> I mean the same vhost thread:
>
> vhost thread #0 transmits packets of flow A on processor M
> ...
> vhost thread #0 moves to another processor N and starts to
> transmit packets of flow A

Thanks for clarifying. Yes, binding vhosts to CPUs
makes the incoming packet go to the same vhost each
time. BTW, are you doing any binding and/or irqbalance
when you run your tests? I am not running either at
this time, but thought both might be useful.

- KK
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On 11/25/2011 11:07 AM, Krishna Kumar2 wrote:
> "Michael S. Tsirkin" wrote on 11/24/2011 09:44:31 PM:
>>> As far as I can see, ixgbe binds queues to physical cpus, so let's
>>> consider:
>>>
>>> vhost thread transmits packets of flow A on processor M
>>> during packet transmission, the ixgbe driver programs the card to
>>> deliver the packets of flow A to queue/cpu M through the flow
>>> director (see ixgbe_atr())
>>> vhost thread then receives packets of flow A from queue M
>>> ...
>>> vhost thread transmits packets of flow A on processor N
>>> the ixgbe driver programs the flow director to change the delivery
>>> of flow A to queue N (cpu N)
>>> vhost thread then receives packets of flow A from queue N
>>> ...
>>>
>>> So, for a single flow A, we may get different queue mappings. Using
>>> the rxhash instead may solve this issue.
>>
>> Or better, transmit a single flow from a single vhost thread.
>>
>> If packets of a single flow get spread over different CPUs,
>> they will get reordered and things are not going to work well.
>
> My testing so far shows that the guest sends on (e.g.) TXQ#2 only,
> which is handled by vhost#2, and this doesn't change for the entire
> duration of the test. Incoming keeps changing for different packets,
> but becomes the same with this patch.

Yes, because the guest chose the txq of virtio-net based on the hash.

> To reiterate, I have not seen the following:
>
> "
> vhost thread transmits packets of flow A on processor M
> ...
> vhost thread transmits packets of flow A on processor N
> "

My description is not clear again :(
I mean the same vhost thread:

vhost thread #0 transmits packets of flow A on processor M
...
vhost thread #0 moves to another processor N and starts to transmit
packets of flow A

> - KK
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On 11/25/2011 10:58 AM, Krishna Kumar2 wrote:
> jasowang wrote on 11/24/2011 06:30:52 PM:
>>
>>>> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>>>> It was reported that the macvtap device selects a
>>>>> different vhost (when used with multiqueue feature)
>>>>> for incoming packets of a single connection. Use
>>>>> packet hash first. Patch tested on MQ virtio_net.
>>>> So this is sure to address the problem, why exactly does this happen?
>>>> Does your device spread a single flow across multiple RX queues? Would
>>>> not that cause trouble in the TCP layer?
>>>> It would seem that using the recorded queue should be faster with
>>>> fewer cache misses. Before we give up on that, I'd
>>>> like to understand why it's wrong. Do you know?
>>> I am using ixgbe. From what I briefly saw, ixgbe_alloc_rx_buffers
>>> calls skb_record_rx_queue when a skb is allocated. When a packet
>>> is received, it sets the rxhash. The recorded value is different
>>> for most skbs when I ran a single-stream TCP test (do skbs move
>>> between the rx_rings?).
>>
>> Yes, they move. It depends on the last processor or tx queue that
>> transmitted the packets of this stream. Because ixgbe selects the
>> tx queue based on the processor id, if the vhost thread transmits
>> skbs on different processors, the skbs of a single stream may come
>> from different rx rings.
>
> But I don't see transmit going on different queues, only incoming.
>
> - KK

Maybe I'm not being clear enough: I mean the processor on the host and
the tx queue of ixgbe. So you would see that a single vhost thread, as
it moves among host cpus, uses different ixgbe tx queues. I think if
you pin the vhost thread to a host cpu, you may get a consistent rx
queue number.
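To make the tx side of this concrete: the behavior described above
corresponds to an ndo_select_queue hook that keys off the current CPU.
The sketch below is a paraphrase of what ixgbe does when the flow
director is active, not the driver's exact code:

#include <linux/netdevice.h>
#include <linux/smp.h>

/* Paraphrased sketch: the chosen tx queue tracks whichever CPU the
 * sender happens to run on, so a migrating vhost thread hops between
 * tx queues, and the flow director then retargets the rx side to
 * match. */
static u16 cpu_keyed_select_queue(struct net_device *dev,
				  struct sk_buff *skb)
{
	unsigned int txq = smp_processor_id();

	/* Wrap into the valid queue range. */
	while (unlikely(txq >= dev->real_num_tx_queues))
		txq -= dev->real_num_tx_queues;

	return txq;
}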
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
"Michael S. Tsirkin" wrote on 11/24/2011 09:44:31 PM: > > As far as I can see, ixgbe binds queues to physical cpu, so let consider: > > > > vhost thread transmits packets of flow A on processor M > > during packet transmission, ixgbe driver programs the card to > > deliver the packet of flow A to queue/cpu M through flow director > > (see ixgbe_atr()) > > vhost thread then receives packet of flow A with from M > > ... > > vhost thread transmits packets of flow A on processor N > > ixgbe driver programs the flow director to change the delivery of > > flow A to queue N ( cpu N ) > > vhost thread then receives packet of flow A with from N > > ... > > > > So, for a single flow A, we may get different queue mappings. Using > > rxhash instead may solve this issue. > > Or better, transmit a single flow from a single vhost thread. > > If packets of a single flow get spread over different CPUs, > they will get reordered and things are not going to work well. My testing so far shows that guest sends on (e.g.) TXQ#2 only, which is handled by vhost#2; and this doesn't change for the entire duration of the test. Incoming keeps changing for different packets but become same with this patch. To iterate, I have not seen the following: " vhost thread transmits packets of flow A on processor M ... vhost thread transmits packets of flow A on processor N " - KK ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On 11/25/2011 12:14 AM, Michael S. Tsirkin wrote:
> On Thu, Nov 24, 2011 at 08:56:45PM +0800, jasowang wrote:
>> On 11/24/2011 06:34 PM, Michael S. Tsirkin wrote:
>>> On Thu, Nov 24, 2011 at 06:13:41PM +0800, jasowang wrote:
>>>> On 11/24/2011 05:59 PM, Michael S. Tsirkin wrote:
>>>>> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>>>>> It was reported that the macvtap device selects a
>>>>>> different vhost (when used with multiqueue feature)
>>>>>> for incoming packets of a single connection. Use
>>>>>> packet hash first. Patch tested on MQ virtio_net.
>>>>> So this is sure to address the problem, why exactly does this happen?
>>>> Ixgbe has a flow director and binds queues to host cpus, so it can
>>>> make sure the packets of a flow are handled by the same queue/cpu.
>>>> So when a vhost thread moves from one host cpu to another, ixgbe
>>>> would therefore send the packets to the new cpu/queue.
>>> Confused. How does ixgbe know about the vhost thread moving?
>> As far as I can see, ixgbe binds queues to physical cpus, so let's
>> consider:
>>
>> vhost thread transmits packets of flow A on processor M
>> during packet transmission, the ixgbe driver programs the card to
>> deliver the packets of flow A to queue/cpu M through the flow
>> director (see ixgbe_atr())
>> vhost thread then receives packets of flow A from queue M
>> ...
>> vhost thread transmits packets of flow A on processor N
>> the ixgbe driver programs the flow director to change the delivery
>> of flow A to queue N (cpu N)
>> vhost thread then receives packets of flow A from queue N
>> ...
>>
>> So, for a single flow A, we may get different queue mappings. Using
>> the rxhash instead may solve this issue.
>
> Or better, transmit a single flow from a single vhost thread.

It already works this way, as the tx queue is chosen based on the tx
hash in the guest, but the vhost thread can move among processors.

> If packets of a single flow get spread over different CPUs,
> they will get reordered and things are not going to work well.

The problem is that vhost does not handle TCP itself, but the ixgbe
driver would think it does, so the nic would deliver packets of a
single flow to different CPUs as the vhost thread doing the
transmission moves.

So, in conclusion: if we cannot assume features of the underlying nic,
using the rxhash instead of the queue mapping to identify a flow is
better.
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
jasowang wrote on 11/24/2011 06:30:52 PM:
>
>>> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>>> It was reported that the macvtap device selects a
>>>> different vhost (when used with multiqueue feature)
>>>> for incoming packets of a single connection. Use
>>>> packet hash first. Patch tested on MQ virtio_net.
>>> So this is sure to address the problem, why exactly does this happen?
>>> Does your device spread a single flow across multiple RX queues? Would
>>> not that cause trouble in the TCP layer?
>>> It would seem that using the recorded queue should be faster with
>>> fewer cache misses. Before we give up on that, I'd
>>> like to understand why it's wrong. Do you know?
>> I am using ixgbe. From what I briefly saw, ixgbe_alloc_rx_buffers
>> calls skb_record_rx_queue when a skb is allocated. When a packet
>> is received, it sets the rxhash. The recorded value is different
>> for most skbs when I ran a single-stream TCP test (do skbs move
>> between the rx_rings?).
>
> Yes, they move. It depends on the last processor or tx queue that
> transmitted the packets of this stream. Because ixgbe selects the
> tx queue based on the processor id, if the vhost thread transmits
> skbs on different processors, the skbs of a single stream may come
> from different rx rings.

But I don't see transmit going on different queues, only incoming.

- KK
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On Thu, Nov 24, 2011 at 08:56:45PM +0800, jasowang wrote:
> On 11/24/2011 06:34 PM, Michael S. Tsirkin wrote:
>> On Thu, Nov 24, 2011 at 06:13:41PM +0800, jasowang wrote:
>>> On 11/24/2011 05:59 PM, Michael S. Tsirkin wrote:
>>>> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>>>> It was reported that the macvtap device selects a
>>>>> different vhost (when used with multiqueue feature)
>>>>> for incoming packets of a single connection. Use
>>>>> packet hash first. Patch tested on MQ virtio_net.
>>>> So this is sure to address the problem, why exactly does this happen?
>>> Ixgbe has a flow director and binds queues to host cpus, so it can
>>> make sure the packets of a flow are handled by the same queue/cpu.
>>> So when a vhost thread moves from one host cpu to another, ixgbe
>>> would therefore send the packets to the new cpu/queue.
>> Confused. How does ixgbe know about the vhost thread moving?
>
> As far as I can see, ixgbe binds queues to physical cpus, so let's
> consider:
>
> vhost thread transmits packets of flow A on processor M
> during packet transmission, the ixgbe driver programs the card to
> deliver the packets of flow A to queue/cpu M through the flow
> director (see ixgbe_atr())
> vhost thread then receives packets of flow A from queue M
> ...
> vhost thread transmits packets of flow A on processor N
> the ixgbe driver programs the flow director to change the delivery
> of flow A to queue N (cpu N)
> vhost thread then receives packets of flow A from queue N
> ...
>
> So, for a single flow A, we may get different queue mappings. Using
> the rxhash instead may solve this issue.

Or better, transmit a single flow from a single vhost thread.

If packets of a single flow get spread over different CPUs,
they will get reordered and things are not going to work well.

>>>> Does your device spread a single flow across multiple RX queues? Would
>>>> not that cause trouble in the TCP layer?
>>>> It would seem that using the recorded queue should be faster with
>>>> fewer cache misses. Before we give up on that, I'd
>>>> like to understand why it's wrong. Do you know?
>>>>
>>>>> [...]
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On Thu, Nov 24, 2011 at 09:00:52PM +0800, jasowang wrote:
> On 11/24/2011 07:14 PM, Krishna Kumar2 wrote:
>> "Michael S. Tsirkin" wrote on 11/24/2011 03:29:03 PM:
>>>
>>> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>>> It was reported that the macvtap device selects a
>>>> different vhost (when used with multiqueue feature)
>>>> for incoming packets of a single connection. Use
>>>> packet hash first. Patch tested on MQ virtio_net.
>>> So this is sure to address the problem, why exactly does this happen?
>>> Does your device spread a single flow across multiple RX queues? Would
>>> not that cause trouble in the TCP layer?
>>> It would seem that using the recorded queue should be faster with
>>> fewer cache misses. Before we give up on that, I'd
>>> like to understand why it's wrong. Do you know?
>> I am using ixgbe. From what I briefly saw, ixgbe_alloc_rx_buffers
>> calls skb_record_rx_queue when a skb is allocated. When a packet
>> is received, it sets the rxhash. The recorded value is different
>> for most skbs when I ran a single-stream TCP test (do skbs move
>> between the rx_rings?).
>
> Yes, they move. It depends on the last processor or tx queue that
> transmitted the packets of this stream. Because ixgbe selects the
> tx queue based on the processor id, if the vhost thread transmits
> skbs on different processors, the skbs of a single stream may come
> from different rx rings.

Is that so? In that case, it looks like there's a bug on the transmit
side, causing skbs to get scattered between the ixgbe tx queues, and
the receive side trouble follows.

>> With this patch, macvtap selects the same device for each
>> incoming packet. I can add some debug output in ixgbe to see what
>> is happening as well. I am not sure if Sasha was using a different
>> device.
>>
>> Cc'ing Jeffrey in case he can add something.
>>
>> thanks,
>>
>> - KK
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On 11/24/2011 07:14 PM, Krishna Kumar2 wrote:
> "Michael S. Tsirkin" wrote on 11/24/2011 03:29:03 PM:
>>
>> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>> It was reported that the macvtap device selects a
>>> different vhost (when used with multiqueue feature)
>>> for incoming packets of a single connection. Use
>>> packet hash first. Patch tested on MQ virtio_net.
>> So this is sure to address the problem, why exactly does this happen?
>> Does your device spread a single flow across multiple RX queues? Would
>> not that cause trouble in the TCP layer?
>> It would seem that using the recorded queue should be faster with
>> fewer cache misses. Before we give up on that, I'd
>> like to understand why it's wrong. Do you know?
>
> I am using ixgbe. From what I briefly saw, ixgbe_alloc_rx_buffers
> calls skb_record_rx_queue when a skb is allocated. When a packet
> is received, it sets the rxhash. The recorded value is different
> for most skbs when I ran a single-stream TCP test (do skbs move
> between the rx_rings?).

Yes, they move. It depends on the last processor or tx queue that
transmitted the packets of this stream. Because ixgbe selects the tx
queue based on the processor id, if the vhost thread transmits skbs on
different processors, the skbs of a single stream may come from
different rx rings.

> With this patch, macvtap selects the same device for each
> incoming packet. I can add some debug output in ixgbe to see what
> is happening as well. I am not sure if Sasha was using a different
> device.
>
> Cc'ing Jeffrey in case he can add something.
>
> thanks,
>
> - KK
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On 11/24/2011 06:34 PM, Michael S. Tsirkin wrote:
> On Thu, Nov 24, 2011 at 06:13:41PM +0800, jasowang wrote:
>> On 11/24/2011 05:59 PM, Michael S. Tsirkin wrote:
>>> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>>> It was reported that the macvtap device selects a
>>>> different vhost (when used with multiqueue feature)
>>>> for incoming packets of a single connection. Use
>>>> packet hash first. Patch tested on MQ virtio_net.
>>> So this is sure to address the problem, why exactly does this happen?
>> Ixgbe has a flow director and binds queues to host cpus, so it can
>> make sure the packets of a flow are handled by the same queue/cpu.
>> So when a vhost thread moves from one host cpu to another, ixgbe
>> would therefore send the packets to the new cpu/queue.
> Confused. How does ixgbe know about the vhost thread moving?

As far as I can see, ixgbe binds queues to physical cpus, so let's
consider:

vhost thread transmits packets of flow A on processor M
during packet transmission, the ixgbe driver programs the card to
deliver the packets of flow A to queue/cpu M through the flow director
(see ixgbe_atr())
vhost thread then receives packets of flow A from queue M
...
vhost thread transmits packets of flow A on processor N
the ixgbe driver programs the flow director to change the delivery of
flow A to queue N (cpu N)
vhost thread then receives packets of flow A from queue N
...

So, for a single flow A, we may get different queue mappings. Using
the rxhash instead may solve this issue.

>>> Does your device spread a single flow across multiple RX queues? Would
>>> not that cause trouble in the TCP layer?
>>> It would seem that using the recorded queue should be faster with
>>> fewer cache misses. Before we give up on that, I'd
>>> like to understand why it's wrong. Do you know?
>>>
>>>> [...]
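A note on why the rxhash sidesteps this migration: it is computed
purely from the flow's addresses and ports, so it comes out the same
on every CPU. A minimal sketch of the idea using the kernel's jhash
helpers — the real software path behind skb_get_rxhash() differs in
detail (it extracts the tuple from the skb and uses a random seed),
but the CPU-independence is the same:

#include <linux/jhash.h>
#include <linux/types.h>

/* Same 4-tuple in -> same hash out, regardless of which CPU runs
 * this, so "hash % numvtaps" yields a stable queue for the flow. */
static u32 flow_hash_sketch(__be32 saddr, __be32 daddr,
			    __be16 sport, __be16 dport)
{
	u32 ports = ((u32)(__force u16)sport << 16) |
		    (__force u16)dport;

	return jhash_3words((__force u32)saddr, (__force u32)daddr,
			    ports, 0);
}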
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
"Michael S. Tsirkin" wrote on 11/24/2011 03:29:03 PM: > Subject Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first > > On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote: > > It was reported that the macvtap device selects a > > different vhost (when used with multiqueue feature) > > for incoming packets of a single connection. Use > > packet hash first. Patch tested on MQ virtio_net. > > So this is sure to address the problem, why exactly does this happen? > Does your device spread a single flow across multiple RX queues? Would > not that cause trouble in the TCP layer? > It would seem that using the recorded queue should be faster with > less cache misses. Before we give up on that, I'd > like to understand why it's wrong. Do you know? I am using ixgbe. From what I briefly saw, ixgbe_alloc_rx_buffers calls skb_record_rx_queue when a skb is allocated. When a packet is received (ixgbe_alloc_rx_buffers), it sets rxhash. The recorded value is different for most skbs when I ran a single stream TCP stream test (does skbs move between the rx_rings?). With this patch, macvtap selects the same device for each incoming packet. I can add some debugs in ixgbe to see what is happening also. I am not sure if Sasha was using a different device. Cc'ing Jeffrey in case he can add something. thanks, - KK ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On Thu, Nov 24, 2011 at 06:13:41PM +0800, jasowang wrote:
> On 11/24/2011 05:59 PM, Michael S. Tsirkin wrote:
>> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>>> It was reported that the macvtap device selects a
>>> different vhost (when used with multiqueue feature)
>>> for incoming packets of a single connection. Use
>>> packet hash first. Patch tested on MQ virtio_net.
>> So this is sure to address the problem, why exactly does this happen?
>
> Ixgbe has a flow director and binds queues to host cpus, so it can
> make sure the packets of a flow are handled by the same queue/cpu.
> So when a vhost thread moves from one host cpu to another, ixgbe
> would therefore send the packets to the new cpu/queue.

Confused. How does ixgbe know about the vhost thread moving?

>> Does your device spread a single flow across multiple RX queues? Would
>> not that cause trouble in the TCP layer?
>> It would seem that using the recorded queue should be faster with
>> fewer cache misses. Before we give up on that, I'd
>> like to understand why it's wrong. Do you know?
>>
>>> [...]
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On 11/24/2011 05:59 PM, Michael S. Tsirkin wrote:
> On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
>> It was reported that the macvtap device selects a
>> different vhost (when used with multiqueue feature)
>> for incoming packets of a single connection. Use
>> packet hash first. Patch tested on MQ virtio_net.
>
> So this is sure to address the problem, why exactly does this happen?

Ixgbe has a flow director and binds queues to host cpus, so it can
make sure the packets of a flow are handled by the same queue/cpu. So
when a vhost thread moves from one host cpu to another, ixgbe would
therefore send the packets to the new cpu/queue.

> Does your device spread a single flow across multiple RX queues? Would
> not that cause trouble in the TCP layer?
> It would seem that using the recorded queue should be faster with
> fewer cache misses. Before we give up on that, I'd
> like to understand why it's wrong. Do you know?
>
>> [...]
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On Thu, Nov 24, 2011 at 01:47:14PM +0530, Krishna Kumar wrote:
> It was reported that the macvtap device selects a
> different vhost (when used with multiqueue feature)
> for incoming packets of a single connection. Use
> packet hash first. Patch tested on MQ virtio_net.

So this is sure to address the problem, why exactly does this happen?
Does your device spread a single flow across multiple RX queues? Would
not that cause trouble in the TCP layer?
It would seem that using the recorded queue should be faster with
fewer cache misses. Before we give up on that, I'd
like to understand why it's wrong. Do you know?

> Signed-off-by: Krishna Kumar
> ---
> [...]
Re: [PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
On 11/24/2011 04:17 PM, Krishna Kumar wrote:
> It was reported that the macvtap device selects a
> different vhost (when used with multiqueue feature)
> for incoming packets of a single connection. Use
> packet hash first. Patch tested on MQ virtio_net.
>
> Signed-off-by: Krishna Kumar
> ---
> @@ -175,6 +175,14 @@ static struct macvtap_queue *macvtap_get
>  	if (!numvtaps)
>  		goto out;
>
> +	/* Check if we can use flow to select a queue */
> +	rxq = skb_get_rxhash(skb);
> +	if (rxq) {
> +		tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
> +		if (tap)
> +			goto out;
> +	}
> +
>  	if (likely(skb_rx_queue_recorded(skb))) {
>  		rxq = skb_get_rx_queue(skb);

Looks reasonable, and we can improve this further by implementing
smarter queue selection with the cooperation of the guest. One way to
start is to let the packets of a flow be handled by the same guest
cpu/vhost thread. This can be done by using accelerated RFS in the
virtio-net driver and "telling" macvtap/tap which queue a packet
should go to. I've started working on a prototype of this; it should
solve the issue of packet steering in the guest.

One more thought: if a nic (such as ixgbe) with a flow director or
similar technology is used, it can make sure the packets belonging to
a flow are handled by the same host cpu. If we can make use of this
feature, more cache locality would be gained.

> @@ -186,14 +194,6 @@ static struct macvtap_queue *macvtap_get
> [...]
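For background on the accelerated RFS idea mentioned above: since
2.6.38 the core can ask a driver to steer a flow toward the queue
closest to its consumer via the ndo_rx_flow_steer hook (under
CONFIG_RFS_ACCEL). A skeleton of the driver side follows; the hardware
programming is left as a comment since it is device specific, and the
function name is a placeholder:

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Skeleton only: a real driver (e.g. via ixgbe's flow director)
 * would parse the flow's tuple from skb here and install a hardware
 * filter delivering that flow to rxq_index, the queue RFS matched to
 * the consuming CPU. The non-negative return value is the filter id,
 * which the core later hands back so stale rules can be expired. */
static int sketch_rx_flow_steer(struct net_device *dev,
				const struct sk_buff *skb,
				u16 rxq_index, u32 flow_id)
{
	/* ... program hardware filter: flow(skb) -> rxq_index ... */
	return flow_id;	/* assume flow_id doubles as the filter id */
}

A driver advertises the hook with .ndo_rx_flow_steer in its
net_device_ops; a virtio-net/macvtap version of the idea would forward
the guest's preferred queue through an analogous path.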
[PATCH] macvtap: Fix macvtap_get_queue to use rxhash first
It was reported that the macvtap device selects a
different vhost (when used with multiqueue feature)
for incoming packets of a single connection. Use
packet hash first. Patch tested on MQ virtio_net.

Signed-off-by: Krishna Kumar
---
 drivers/net/macvtap.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff -ruNp org/drivers/net/macvtap.c new/drivers/net/macvtap.c
--- org/drivers/net/macvtap.c	2011-10-22 08:38:01.0 +0530
+++ new/drivers/net/macvtap.c	2011-11-16 18:34:51.0 +0530
@@ -175,6 +175,14 @@ static struct macvtap_queue *macvtap_get
 	if (!numvtaps)
 		goto out;
 
+	/* Check if we can use flow to select a queue */
+	rxq = skb_get_rxhash(skb);
+	if (rxq) {
+		tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
+		if (tap)
+			goto out;
+	}
+
 	if (likely(skb_rx_queue_recorded(skb))) {
 		rxq = skb_get_rx_queue(skb);
 
@@ -186,14 +194,6 @@ static struct macvtap_queue *macvtap_get
 		goto out;
 	}
 
-	/* Check if we can use flow to select a queue */
-	rxq = skb_get_rxhash(skb);
-	if (rxq) {
-		tap = rcu_dereference(vlan->taps[rxq % numvtaps]);
-		if (tap)
-			goto out;
-	}
-
 	/* Everything failed - find first available queue */
 	for (rxq = 0; rxq < MAX_MACVTAP_QUEUES; rxq++) {
 		tap = rcu_dereference(vlan->taps[rxq]);