[dpdk-dev] rte_prefetch0() performance info

2015-03-05 Thread Parikshith Chowdaiah
Hi all, I have a question about the usage of the rte_prefetch0() function. In one of the sample files, we have an implementation like: /* Prefetch first packets */ for (j = 0; j < PREFETCH_OFFSET && j < nb_rx; j++) { rte_prefetch0(rte_pktmbuf_mtod(

[dpdk-dev] rte_prefetch0() performance info

2015-03-05 Thread Anuj Kalia
Hi Parikshith. A CPU core can have only a limited number of prefetches in flight (around 10). If you issue 64 prefetches (or any nb_rx greater than about 10) in quick succession, you'll stall on memory access. The main idea here is to overlap the prefetches for some packets with computation on other packets. This paper ex
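
The overlap described above can be sketched in plain C. This is only an illustration, not the code from the DPDK sample: prefetch0() is a hypothetical stand-in for rte_prefetch0() built on the GCC/Clang __builtin_prefetch, struct pkt and process_pkt() are made-up placeholders for an mbuf and the per-packet work, and the PREFETCH_OFFSET value of 3 is an assumption chosen to stay well under the in-flight limit mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed look-ahead distance; plays the role of PREFETCH_OFFSET in the sample. */
#define PREFETCH_OFFSET 3

/* Hypothetical stand-in for a packet buffer; this is not DPDK's struct rte_mbuf. */
struct pkt {
    uint8_t data[64];
};

/* Hypothetical stand-in for rte_prefetch0(): GCC/Clang builtin,
 * read access (0), keep in all cache levels (locality 3). */
static inline void prefetch0(const void *p)
{
    __builtin_prefetch(p, 0, 3);
}

/* Placeholder per-packet work: sum the payload bytes. */
static uint32_t process_pkt(const struct pkt *p)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < sizeof(p->data); i++)
        sum += p->data[i];
    return sum;
}

/* Process a burst while keeping at most PREFETCH_OFFSET prefetches outstanding. */
static uint32_t process_burst(struct pkt *const pkts[], size_t nb_rx)
{
    uint32_t total = 0;
    size_t j;

    /* Prefetch first packets */
    for (j = 0; j < PREFETCH_OFFSET && j < nb_rx; j++)
        prefetch0(pkts[j]);

    /* Work on packet j while the prefetch for packet j + PREFETCH_OFFSET is in flight. */
    for (j = 0; j + PREFETCH_OFFSET < nb_rx; j++) {
        prefetch0(pkts[j + PREFETCH_OFFSET]);
        total += process_pkt(pkts[j]);
    }

    /* The remaining packets were already prefetched; just process them. */
    for (; j < nb_rx; j++)
        total += process_pkt(pkts[j]);

    return total;
}
```

Calling process_burst() on an array of nb_rx packet pointers issues one new prefetch per processed packet, so the number of outstanding prefetches stays near PREFETCH_OFFSET instead of growing with nb_rx. The best offset depends on the CPU and on how long the per-packet work takes, so it is worth measuring rather than assuming.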