Pinging others on the mailing list for opinions on this. What throughput does "native" VPP+DPDK get, and how is this problem solved there?
On Thu, Dec 7, 2017 at 11:55 AM, Michal Mazur <michal.ma...@linaro.org> wrote:

> The _odp_packet_inline structure is common to all packets and takes up to
> two cache lines (it contains only offsets). Reading a pointer for each
> packet from the VLIB buffer would require fetching 10 million cache lines
> per second. Using prefetches does not help.
>
> On 7 December 2017 at 18:37, Bill Fischofer <bill.fischo...@linaro.org>
> wrote:
>
>> Yes, but _odp_packet_inline.udata is clearly not in the VLIB cache line
>> either, so it's a separate cache line access. Are you seeing this
>> difference in real runs or microbenchmarks? Why isn't the entire VLIB
>> buffer being prefetched at dispatch? Sequential prefetching should add
>> negligible overhead.
>>
>> On Thu, Dec 7, 2017 at 11:13 AM, Michal Mazur <michal.ma...@linaro.org>
>> wrote:
>>
>>> It seems that only the first cache line of the VLIB buffer is in L1,
>>> and the new pointer can be placed only in the second cache line.
>>> Using a constant offset between the user area and the ODP header I get
>>> 11 Mpps, with the pointer stored in the VLIB buffer only 10 Mpps, and
>>> with this new API 10.6 Mpps.
>>>
>>> On 7 December 2017 at 18:04, Bill Fischofer <bill.fischo...@linaro.org>
>>> wrote:
>>>
>>>> How would calling an API be better than referencing the stored data
>>>> yourself? A cache line reference is a cache line reference, and
>>>> presumably the VLIB buffer is already in L1 since it's your active data.
>>>>
>>>> On Thu, Dec 7, 2017 at 10:45 AM, Michal Mazur <michal.ma...@linaro.org>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> For the odp4vpp plugin we need a new API function which, given a user
>>>>> area pointer, will return a pointer to the ODP packet buffer. It is
>>>>> needed when packets processed by VPP are sent back to ODP and only a
>>>>> pointer to the VLIB buffer data (stored inside the user area) is known.
>>>>>
>>>>> I have tried to store the ODP buffer pointer in the VLIB data, but
>>>>> reading it for every packet lowers performance by 800 kpps.
>>>>>
>>>>> For the odp-dpdk implementation it can look like:
>>>>>
>>>>> /**
>>>>>  * @internal Inline function
>>>>>  * @param uarea  Pointer to the packet's user area
>>>>>  * @return Handle of the ODP packet that owns the user area
>>>>>  */
>>>>> static inline odp_packet_t _odp_packet_from_user_area(void *uarea)
>>>>> {
>>>>>         return (odp_packet_t)((uintptr_t)uarea -
>>>>>                               _odp_packet_inline.udata);
>>>>> }
>>>>>
>>>>> Please let me know what you think.
>>>>>
>>>>> Thanks,
>>>>> Michal
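
For anyone following the thread, the constant-offset idea proposed above can be sketched in isolation roughly as follows. This is only a minimal illustration, not the odp-dpdk code: struct demo_pkt_hdr, its field sizes, demo_udata_offset and both helper functions are hypothetical names made up for this sketch. It also assumes the user area is embedded in the packet header at a fixed offset, which is what makes the reverse lookup possible without reading a stored pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet header. The real odp-dpdk layout
 * differs; the field sizes here are placeholders. */
struct demo_pkt_hdr {
	uint8_t meta[128]; /* per-packet metadata */
	uint8_t uarea[64]; /* user area embedded at a fixed offset */
};

/* Plays the role of _odp_packet_inline.udata in the proposal: one offset
 * shared by all packets, so no per-packet pointer has to be loaded. */
static const size_t demo_udata_offset = offsetof(struct demo_pkt_hdr, uarea);

/* Forward direction: packet header -> user area. */
static inline void *demo_user_area(struct demo_pkt_hdr *pkt)
{
	return (uint8_t *)pkt + demo_udata_offset;
}

/* Reverse direction: user area -> packet header, by subtracting the same
 * constant offset. Only valid for pointers obtained from demo_user_area(). */
static inline struct demo_pkt_hdr *demo_packet_from_user_area(void *uarea)
{
	assert(uarea != NULL);
	return (struct demo_pkt_hdr *)((uintptr_t)uarea - demo_udata_offset);
}
```

Calling demo_packet_from_user_area(demo_user_area(&pkt)) gives back &pkt. The conversion touches no packet memory at all, only the shared offset, which is consistent with the observation in the thread that it avoids the extra per-packet cache line fetch.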