Hi,
While reading the DPDK DLB eventdev driver code, I found that it uses the
cldemote instruction in dlb_recv_qe(), but I don't understand why it is used
there.
As I understand it, cldemote is a hint to move a cache line to a more distant
level of the cache hierarchy (for example, from a core's private cache to the
shared cache), which helps accelerate core-to-core communication. But who will
use the memory at cache_line_base afterwards?
> static __rte_always_inline int dlb_recv_qe(struct dlb_port *qm_port,
>                                            struct dlb_dequeue_qe *qe,
>                                            uint8_t *offset)
> {
>
>     cq_addr = dlb_port[qm_port->id][PORT_TYPE(qm_port)].cq_base;
>     cq_addr = &cq_addr[qm_port->cq_idx];
>     cache_line_base = (void *)(((uintptr_t)cq_addr) & ~0x3F);
>     *offset = ((uintptr_t)cq_addr & 0x30) >> 4;
>     /* Load the next CQ cache line from memory. Pack these reads as tight
>      * as possible to reduce the chance that DLB invalidates the line while
>      * the CPU is reading it. Read the cache line backwards to ensure that
>      * if QE[N] (N > 0) is valid, then QEs[0:N-1] are too.
>      *
>      * (Valid QEs start at &qe[offset])
>      */
>     qes[3] = _mm_load_si128((__m128i *)&cache_line_base[6]);
>     qes[2] = _mm_load_si128((__m128i *)&cache_line_base[4]);
>     qes[1] = _mm_load_si128((__m128i *)&cache_line_base[2]);
>     qes[0] = _mm_load_si128((__m128i *)&cache_line_base[0]);
>
>     /* Evict the cache line ASAP */
>     rte_cldemote(cache_line_base);
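
To check that I read the address arithmetic in the quoted code correctly, here
is a small standalone sketch of it. It is not taken from the driver: the helper
names cache_line_base_of() and qe_offset_of() are hypothetical, and I am
assuming 64-byte cache lines holding four 16-byte QEs, which is what the masks
0x3F and 0x30 seem to imply.

```c
#include <stdint.h>

/* Hypothetical helper (not in the driver): round an address down to the
 * start of its cache line, assuming 64-byte lines (mask 0x3F). */
static inline uintptr_t cache_line_base_of(uintptr_t addr)
{
    return addr & ~(uintptr_t)0x3F;
}

/* Hypothetical helper (not in the driver): index (0..3) of the 16-byte QE
 * slot that addr falls in within its cache line. Bits 4 and 5 of the
 * address select the slot, hence the 0x30 mask and the shift by 4. */
static inline uint8_t qe_offset_of(uintptr_t addr)
{
    return (uint8_t)((addr & 0x30) >> 4);
}
```

For example, an address of 0x1070 would give a cache line base of 0x1040 and an
offset of 3, i.e. the last QE slot in that line.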
Thanks.