At Cisco, we use DPDK for a very high-speed packet-processing application. We do not use NIC TCP offload or RSS hashing. Having those fields in the first cache line of the mbuf, while the obligatory mb->next field sits in the second cache line, means our fast path has to touch both lines, which causes significant LSU (load/store unit) pressure and performance degradation. If it does not affect other applications, I would like to propose reshuffling the fields so that the obligatory "next" field falls in the first cache line and the RSS hash moves to the second. If this reshuffling does hurt other applications, another idea is to make the layout compile-time configurable. Please provide feedback.
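
To make the proposal concrete, here is a rough sketch of the two layouts and of what a compile-time switch could look like. Everything in it is an assumption for illustration: struct sketch_mbuf is a simplified stand-in and not the real struct rte_mbuf, SKETCH_NEXT_IN_FIRST_LINE is a hypothetical macro rather than an existing DPDK build option, and the 64-byte cache line is assumed. It needs a C11 compiler and builds as a standalone translation unit; the layout checks are evaluated at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64   /* assumption: 64-byte cache lines */

/* Hypothetical build-time switch; NOT an existing DPDK option.
 * 1 = proposed layout ("next" in the first cache line),
 * 0 = layout described in the mail ("next" in the second line). */
#ifndef SKETCH_NEXT_IN_FIRST_LINE
#define SKETCH_NEXT_IN_FIRST_LINE 1
#endif

/* Simplified stand-in for an mbuf; NOT the real struct rte_mbuf. */
struct sketch_mbuf {
    /* first cache line starts here */
    _Alignas(CACHE_LINE_SIZE) void *buf_addr;
    uint16_t data_len;
    uint32_t pkt_len;
#if SKETCH_NEXT_IN_FIRST_LINE
    struct sketch_mbuf *next;                      /* always accessed */
    /* second cache line: fields only offload users need */
    _Alignas(CACHE_LINE_SIZE) uint32_t rss_hash;
    uint64_t offload_flags;
#else
    uint32_t rss_hash;
    uint64_t offload_flags;
    /* second cache line */
    _Alignas(CACHE_LINE_SIZE) struct sketch_mbuf *next;
#endif
};

/* Index of the cache line (0-based) that a given byte offset falls in. */
static inline size_t cache_line_of(size_t offset)
{
    return offset / CACHE_LINE_SIZE;
}

/* Compile-time checks that the selected layout is what we expect. */
#if SKETCH_NEXT_IN_FIRST_LINE
_Static_assert(offsetof(struct sketch_mbuf, next) / CACHE_LINE_SIZE == 0,
               "next must be in the first cache line");
_Static_assert(offsetof(struct sketch_mbuf, rss_hash) / CACHE_LINE_SIZE == 1,
               "rss_hash must be in the second cache line");
#else
_Static_assert(offsetof(struct sketch_mbuf, next) / CACHE_LINE_SIZE == 1,
               "next must be in the second cache line");
#endif
```

Building with -DSKETCH_NEXT_IN_FIRST_LINE=0 selects the other layout, so applications that rely on the offload fields being in the first line would not have to change.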
--
- Thanks
char * (*shesha) (uint64_t cache, uint8_t F00D) { return 0x0000C0DE; }