Hi Bhanu,

Regards
_Sugesh

> -----Original Message-----
> From: Bodireddy, Bhanuprakash
> Sent: Monday, November 27, 2017 4:35 PM
> To: 'Aaron Conole' <acon...@redhat.com>
> Cc: 'd...@openvswitch.org' <d...@openvswitch.org>; Ben Pfaff <b...@ovn.org>;
> Chandran, Sugesh <sugesh.chand...@intel.com>
> Subject: RE: [ovs-dev] [PATCH] packets: Prefetch the packet metadata in
> cacheline1.
> 
> >>Bhanuprakash Bodireddy <bhanuprakash.bodire...@intel.com> writes:
> >>
> >>> pkt_metadata_prefetch_init() is used to prefetch the packet metadata
> >>> before initializing the metadata in pkt_metadata_init(). This is
> >>> done for every packet in userspace datapath and is performance critical.
> >>>
> >>> Commit 99fc16c0 prefetches only cacheline0 and cacheline2 as the
> >>> metadata part of respective cachelines will be initialized by
> >>pkt_metadata_init().
> >>>
> >>> However in VXLAN case when popping the vxlan header,
> >>> netdev_vxlan_pop_header() invokes pkt_metadata_init_tnl() which
> >>> zeroes out metadata part of
> >>> cacheline1 that wasn't prefetched earlier and causes performance
> >>> degradation.
> >>>
> >>> By prefetching cacheline1, 9% performance improvement is observed.
> >>
> >>Do we see a degradation in the non-vxlan case?  If not, then I don't
> >>see any reason not to apply this patch.
> >
> >This patch doesn't impact the performance of non-vxlan cases and only
> >has a positive impact in the vxlan case.
> 
> The commit message claims that the performance improvement was 9% with
> this patch but when Sugesh was checking he wasn't getting that performance
> improvement on his Haswell.
> 
> I was chatting to Sugesh this afternoon on this patch and we found some
> interesting details and much of this boils down to how OvS is built
> (apart from HW and BIOS settings - TB disabled).
> 
> The test case here measures VXLAN decapsulation performance alone for
> packet sizes of 118 bytes.
> The OvS CFLAGS and throughput numbers are as below.
> 
> CFLAGS="-O2"
>     Master              4.667 Mpps
>     With Patch       5.045 Mpps
> 
> CFLAGS="-O2 -msse4.2"
>     Master              4.710 Mpps
>     With Patch       5.097 Mpps
> 
> CFLAGS="-O2 -march=native"
>     Master              5.072 Mpps
>     With Patch       5.193 Mpps
> 
> CFLAGS="-Ofast -march=native"
>     Master              5.349 Mpps
>     With Patch       5.378 Mpps
> 
> This means the performance measurements/claims are difficult to assess and,
> as one can see above, with "-Ofast -march=native" the improvement is
> insignificant, but this is very platform dependent due to the
> "-march=native" flag. Also, the optimization flags seem to make a
> significant difference.
[Sugesh] I also tested on my board with the same set of configurations and I am
getting the same results as yours.
So whether this patch improves performance depends on the compiler options. I am
not sure what the most preferred/used compiler options out there are.
I always build OVS with CFLAGS="-Ofast -march=native" and the patch doesn't
give a great improvement with it.
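
To put the quoted table in relative terms, here is a minimal C sketch that
computes the percentage gain per build configuration. The Mpps figures are
copied from the table above; the names gain_pct(), runs and print_gains() are
made up for this illustration. By this arithmetic the gain is roughly 8% for
the two -O2 builds without -march=native, about 2.4% with
"-O2 -march=native" and about 0.5% with "-Ofast -march=native".

```c
#include <stdio.h>

/* Relative throughput gain of `patched` over `master`, in percent. */
static double gain_pct(double master, double patched)
{
    return (patched - master) / master * 100.0;
}

/* Mpps figures copied from the table quoted above. */
static const struct {
    const char *cflags;
    double master;
    double patched;
} runs[] = {
    { "-O2",                  4.667, 5.045 },
    { "-O2 -msse4.2",         4.710, 5.097 },
    { "-O2 -march=native",    5.072, 5.193 },
    { "-Ofast -march=native", 5.349, 5.378 },
};

/* Print one line per build configuration with its relative gain. */
static void print_gains(void)
{
    for (size_t i = 0; i < sizeof runs / sizeof runs[0]; i++) {
        printf("%-22s %+.1f%%\n", runs[i].cflags,
               gain_pct(runs[i].master, runs[i].patched));
    }
}
```

Calling print_gains() from a main() prints the four configurations side by
side, which may be a convenient form for the commit message.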

I don't mind Acking the patch, if you could re-send the patch with these
results and options in the commit message.
At least it will offer a performance improvement for other build options.
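
For readers following the thread, the idea behind the patch can be sketched
roughly as below. This is not the OVS code: struct md_sketch and both
functions are hypothetical stand-ins for struct pkt_metadata,
pkt_metadata_prefetch_init() and pkt_metadata_init_tnl(), a 64-byte cacheline
is assumed, and the tunnel-init stand-in zeroes the whole of cacheline1 for
simplicity. __builtin_prefetch() is the GCC/Clang builtin.

```c
#include <stddef.h>
#include <string.h>

#define CACHE_LINE_SIZE 64   /* assumption: typical x86 cacheline size */

/* Hypothetical stand-in for the metadata layout: three cachelines. */
struct md_sketch {
    _Alignas(CACHE_LINE_SIZE) unsigned char cacheline0[CACHE_LINE_SIZE];
    _Alignas(CACHE_LINE_SIZE) unsigned char cacheline1[CACHE_LINE_SIZE];
    _Alignas(CACHE_LINE_SIZE) unsigned char cacheline2[CACHE_LINE_SIZE];
};

/* Issue write prefetches for every cacheline that a later init may touch. */
static inline void md_sketch_prefetch_init(struct md_sketch *md)
{
    __builtin_prefetch(md->cacheline0, 1);
    __builtin_prefetch(md->cacheline1, 1);   /* the prefetch the patch adds */
    __builtin_prefetch(md->cacheline2, 1);
}

/* Stand-in for the tunnel-pop path: it writes to cacheline1, so without the
 * extra prefetch above this store may stall on a cache miss. */
static inline void md_sketch_init_tnl(struct md_sketch *md)
{
    memset(md->cacheline1, 0, sizeof md->cacheline1);
}
```

Whether the extra prefetch pays off depends on how the compiler schedules the
surrounding code, which would be consistent with the gain shrinking under
"-Ofast -march=native".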

> 
> - Bhanuprakash.