> -----Original Message-----
> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Panu Matilainen
> Sent: Wednesday, June 24, 2015 9:33 AM
> To: Pravin Shelar; Jesse Gross
> Cc: dev@openvswitch.org; Flavio Leitner
> Subject: Re: [ovs-dev] [PATCH] dpif-netdev: Check for PKT_RX_RSS_HASH flag.
> 
> On 06/24/2015 05:06 AM, Pravin Shelar wrote:
> > On Tue, Jun 23, 2015 at 2:51 PM, Jesse Gross <je...@nicira.com> wrote:
> >> On Mon, Jun 22, 2015 at 8:08 PM, Pravin Shelar <pshe...@nicira.com> wrote:
> >>> On Fri, Jun 19, 2015 at 11:24 AM, Daniele Di Proietto
> >>> <diproiet...@vmware.com> wrote:
> >>>>
> >>>>
> >>>> On 18/06/2015 23:57, "Traynor, Kevin" <kevin.tray...@intel.com> wrote:
> >>>>
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>
> >>>>>> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Daniele Di
> >>>>>
> >>>>>> Proietto
> >>>>>
> >>>>>> Sent: Tuesday, June 16, 2015 7:39 PM
> >>>>>
> >>>>>> To: dev@openvswitch.org
> >>>>>
> >>>>>> Subject: [ovs-dev] [PATCH] dpif-netdev: Check for PKT_RX_RSS_HASH flag.
> >>>>>
> >>>>>>
> >>>>>
> >>>>>> DPDK mbufs contain a valid RSS hash only if PKT_RX_RSS_HASH is
> >>>>>
> >>>>>> set in 'ol_flags'.  Otherwise the hash is garbage and doesn't
> >>>>>
> >>>>>> relate to the packet.
> >>>>>
> >>>>>>
> >>>>>
> >>>>>> This fixes an issue with vhost, which, being a virtual NIC, doesn't
> >>>>>
> >>>>>> compute the hash.
> >>>>>
> >>>>>>
> >>>>>
> >>>>>> Unfortunately the ixgbe vPMD doesn't set the PKT_RX_RSS_HASH, forcing
> >>>>>
> >>>>>> OVS to compute a hash in software.  This has a significant impact on
> >>>>>
> >>>>>> performance (-30% throughput in a single flow setup) which can be
> >>>>>
> >>>>>> mitigated if the CPU supports crc32c instructions.
> >>>>>
> >>>>>
> >>>>>
> >>>>> As per the other thread on this I'm a bit concerned about the
> performance
> >>>>>
> >>>>> drop from this patch, so I did some testing of this and alternative/
> >>>>>
> >>>>> complementary solutions.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Here are the options I looked at and some comments:
> >>>>>
> >>>>> 1. This patch in isolation: vhost drops ~15% for vhost-vhost and
> >>>>>
> >>>>> phy-vhost-phy (because of sw hash) but there are also drops of ~25% for
> >>>>>
> >>>>> phy-phy and ~15% for phy-ivshmem-phy.
> >>>>>
> >>>>>
> >>>>>
> >>>>> 2. Leave the code as is and let EMC misses happen for vhost rx pkts:
> >>>>>
> >>>>> I measure this at ~35% drop if missed *every time* for vhost-vhost. We
> >>>>>
> >>>>> see in testing that it can also never happen, but this is not
> realistic.
> >>>>>
> >>>>> There should be no impact to other DPDK interfaces.
> >>>>>
> >>>>>
> >>>>>
> >>>>> 3. Add hash reset for packets from vhost: This is another way of
> forcing
> >>>>>
> >>>>> the software hash for vhost rx and it is roughly equivalent in
> performance
> >>>>>
> >>>>> to 1. for vhost-vhost (~15% drop), while there is no significant drop
> >>>>>
> >>>>> for phy-vhost-phy. There should be no impact to other DPDK interfaces.
> >>>>>
> >>>>>
> >>>>>
> >>>>> 4. Apply this patch and turn off Rx Vectorisation. vhost-vhost will
> drop
> >>>>>
> >>>>> ~15% as per 1. and there should be nothing significant for phy-vhost-
> phy.
> >>>>>
> >>>>> We would lose the 10% gain that rx vectorisation gave us for phy-phy.
> >>>>>
> >>>>> There should be no impact for dpdkr ports.
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> In terms of not knowing whether the hw hash is valid or not if the flag
> is
> >>>>>
> >>>>> not checked, I would have expected the pmd to return an error on config
> if
> >>>>>
> >>>>> the hash wasn't supported, but I'm not sure that it does.
> >>>>>
> >>>>> In the worst case where there was an incorrect hash, it would miss the
> EMC
> >>>>>
> >>>>> which is about a 45% drop for phy-phy. I would think it's pretty safe
> that
> >>>>>
> >>>>> if we configure it, the hash will be correct but I guess there is a
> >>>>>
> >>>>> possibility it wouldn't be.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Even if it is possible to get a smaller patch to fix the underlying
> issue
> >>>>>
> >>>>> in DPDK, it would be in DPDK 2.1 at the earliest meaning the
> performance
> >>>>>
> >>>>> would remain low until sometime in August. If it's DPDK 2.2, then it
> would
> >>>>>
> >>>>> be sometime in December. This would mean any performance drops would be
> >>>>>
> >>>>> present in OVS 2.4 and possibly OVS 2.5.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Sorry :( but based on the performance drop with this patch in isolation
> it
> >>>>>
> >>>>> would be a NAK from me. My preference would be 3 which gives best
> >>>>> performance,
> >>>>>
> >>>>> or 4 which is a bit lower for phy-phy but safer.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Kevin.
> >>>>
> >>>> Thanks for all the testing.  I guess it might make sense to stretch our
> >>>> interpretation of the API in this case, because it wouldn't affect
> >>>> correctness.
> >>>>
> >>>> Unless there are any other objections I'm fine with the 3rd approach.
> >>>>
> >>>
> >>> We can use the 3rd approach to fix the issue on branch 2.4, then have a
> >>> patch to check the PKT_RX_RSS_HASH flag on master. By the time we release
> >>> branch 2.5 we will have a proper fix in DPDK and performance will bounce
> >>> back.
> >>
> >> I think this is probably a reasonable compromise. I think it's better
> >> to not keep a workaround in for an unbounded amount of time, otherwise
> >> we'll forget about it and it will come back to bite us in the future.
> >
> > OK, once the DPDK fix is backported to DPDK 2.0, we can remove the workaround.
> 
> That's assuming there will be a DPDK 2.0.1 release, but I have seen no
> evidence of such plans in the DPDK camp.

I don't expect there will be a DPDK 2.0.1 release either. I'm optimistic we
can get a standalone patch to fix the issue in DPDK 2.1 which we will have
at the end of July. We could then roll DPDK 2.1 support into OVS master (and
presumably OVS 2.5).

The issue is fixed as part of the unified packet API changes, but those won't
be available (by default) until DPDK 2.2, so obviously we would prefer not to
have to wait until then.
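
For readers following the thread, the two approaches being compared can be
sketched roughly as below. This is only an illustration under stated
assumptions, not the actual OVS or DPDK code: the struct, the flag value and
every sketch_* name are hypothetical stand-ins, the software hash is a
placeholder (FNV-1a) rather than the hash OVS uses, and treating a hash of 0
as "no hash" in the second approach is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the DPDK flag; the real definition lives in
 * rte_mbuf.h and its value may differ. */
#define SKETCH_PKT_RX_RSS_HASH (1ULL << 1)

/* Hypothetical, simplified mbuf; the real struct has many more fields. */
struct sketch_mbuf {
    uint64_t ol_flags;    /* offload flags filled in by the driver */
    uint32_t rss_hash;    /* only meaningful if the driver computed it */
    const uint8_t *data;  /* packet bytes */
    size_t len;           /* number of packet bytes */
};

/* Placeholder software hash (FNV-1a), standing in for the OVS flow hash. */
static uint32_t
sketch_sw_hash(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Approach 1 (the patch): trust the hash only when the flag is set,
 * otherwise fall back to hashing in software. */
static uint32_t
sketch_hash_checking_flag(const struct sketch_mbuf *m)
{
    assert(m != NULL);
    if (m->ol_flags & SKETCH_PKT_RX_RSS_HASH) {
        return m->rss_hash;
    }
    return sketch_sw_hash(m->data, m->len);
}

/* Approach 3: the vhost receive path clears the hash on every packet... */
static void
sketch_vhost_reset_hash(struct sketch_mbuf *m)
{
    assert(m != NULL);
    m->rss_hash = 0;
}

/* ...and the datapath ignores the flag, hashing in software only when the
 * stored hash is 0 (assumed here to mean "no hash"). */
static uint32_t
sketch_hash_ignoring_flag(const struct sketch_mbuf *m)
{
    assert(m != NULL);
    if (m->rss_hash != 0) {
        return m->rss_hash;
    }
    return sketch_sw_hash(m->data, m->len);
}
```

With the first function, a driver that computes the hash but does not set
the flag (as reported for the ixgbe vPMD) always pays for the software hash.
With the second pair, only ports that explicitly reset the hash do, which is
why the phy-phy numbers are unaffected by the 3rd approach.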

> 
>       - Panu -
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev