Hi,
2015-09-24 22:10, Arnon Warshavsky:
> Moving from dpdk 1.5 to 2.0 we observed a PPS performance degradation of
> ~30%.
> After chasing this one for a while we found the problem:
>
> A) Between the 2 versions rte_mbuf was increased in size from 1 to 2 cache
> lines.
> B) The standard (non-vector) rx function does not perform a prefetch for
> the 2nd cache line of the mbuf (I see this bug exists in 2.1 as well) and
> it touches it setting the next pointer to NULL.
> I tested it in ixgbe, but it looks like it exists in all drivers in the
> *_rx_recv_pkts() and *_rx_recv_scattered_pkts() functions.
> Once added the prefetch for the 2nd line, we were back in our previous
> numbers.
>
> I believe this one slipped under the radar as the vector mode is now the
> default.
> We stumbled into it because we work in non-vector mode due to a different
> mempool bug in 2.0 which sometimes crashes the application upon port stop.
Big thanks for this double bug report!
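To make point B concrete, here is a rough sketch of the pattern being
described. It is only an illustration under stated assumptions: toy_mbuf,
toy_prefetch_second_line() and toy_rx_burst() are hypothetical names, the
struct is not the real rte_mbuf layout, and the loop is not the ixgbe
receive code. It assumes a 64-byte cache line and a GCC/Clang compiler
(for __builtin_prefetch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical two-cache-line buffer header (NOT the real rte_mbuf). */
struct toy_mbuf {
    /* first cache line: fields written on every receive */
    void *buf_addr;
    uint32_t pkt_len;
    uint16_t data_len;
    /* second cache line: the chaining pointer lives here */
    _Alignas(CACHE_LINE) struct toy_mbuf *next;
};

_Static_assert(offsetof(struct toy_mbuf, next) == CACHE_LINE,
               "next must sit at the start of the second cache line");

/* Hint that the second cache line of m is about to be written. */
static inline void toy_prefetch_second_line(const struct toy_mbuf *m)
{
    __builtin_prefetch((const char *)m + CACHE_LINE, 1, 3);
}

/* Hypothetical scalar receive loop: hands out up to nb_pkts buffers. */
static uint16_t toy_rx_burst(struct toy_mbuf **ring, uint16_t avail,
                             struct toy_mbuf **rx_pkts, uint16_t nb_pkts)
{
    uint16_t n = avail < nb_pkts ? avail : nb_pkts;

    for (uint16_t i = 0; i < n; i++) {
        struct toy_mbuf *m = ring[i];

        assert(m != NULL);

        /* Start loading the next buffer's second line before we need it. */
        if (i + 1 < n)
            toy_prefetch_second_line(ring[i + 1]);

        m->pkt_len = 64;  /* placeholder lengths, first cache line */
        m->data_len = 64;
        m->next = NULL;   /* this store touches the second cache line */
        rx_pkts[i] = m;
    }
    return n;
}
```

Without the prefetch, the store to next would likely miss the cache for
each packet, which matches the kind of per-packet cost described above.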
> I have 2 questions
> 1)
> Could anyone tell if the regression tests are comparing performance while
> building DPDK with the default set of flags alone, or are multiple options
> examined?
There is no official performance regression test,
though Intel is probably monitoring performance on their own hardware.
By the way, it would be a good improvement to have such a standard
benchmark in DTS or elsewhere.
> 2)
> How are issues like that being tracked and later associated to a patch?
In general, such an issue is followed by a discussion and a patch on this
mailing list. The patch must record the fixed issue in the release notes.
To give current bugs better exposure, we could set up a bug tracker.
I think it's time to think about it seriously. Let's discuss the possible
solutions in another thread.
Thanks again to you and all the Qwilt team.
PS: it would be nice to hear about your DPDK deployment and results.