On 09/06/2017 02:33 PM, Jan Scheurich wrote:
> Hi Billy,
>
>> You are going to have to take the hit crossing the NUMA boundary at some
>> point if your NIC and VM are on different NUMAs.
>>
>> So are you saying that it is more expensive to cross the NUMA boundary from
>> the pmd to the VM than to cross it from the NIC to the PMD?
>
> Indeed, that is the case: If the NIC crosses the QPI bus when storing packets
> in the remote NUMA there is no cost involved for the PMD. (The QPI bandwidth
> is typically not a bottleneck.) The PMD only performs local memory access.
>
> On the other hand, if the PMD crosses the QPI when copying packets into a
> remote VM, there is a huge latency penalty involved, consuming lots of PMD
> cycles that cannot be spent on processing packets. We at Ericsson have
> observed exactly this behavior.
>
> This latency penalty becomes even worse when the LLC cache hit rate is
> degraded due to LLC cache contention with real VNFs and/or unfavorable packet
> buffer re-use patterns as exhibited by real VNFs compared to typical
> synthetic benchmark apps like DPDK testpmd.
>
>>
>> If so then in that case you'd like to have two (for example) PMDs polling 2
>> queues on the same NIC. With the PMDs on each of the NUMA nodes forwarding
>> to the VMs local to that NUMA?
>>
>> Of course your NIC would then also need to be able to know which VM (or at
>> least which NUMA the VM is on) in order to send the frame to the correct
>> rxq.
>
> That would indeed be optimal but hard to realize in the general case (e.g.
> with VXLAN encapsulation) as the actual destination is only known after
> tunnel pop. Here perhaps some probabilistic steering of RSS hash values based
> on measured distribution of final destinations might help in the future.
>
> But even without that in place, we need PMDs on both NUMAs anyhow (for
> NUMA-aware polling of vhostuser ports), so why not use them to also poll
> remote eth ports? We can achieve better average performance with fewer PMDs
> than with the current limitation to NUMA-local polling.
>

If the user has some knowledge of the NUMA locality of ports and can place
VMs accordingly, default cross-NUMA assignment can harm performance. Also,
it would make for very unpredictable performance from test to test and even
from flow to flow on a datapath.

Kevin.

> BR, Jan
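
For readers who want to check the NUMA locality of a port before placing
VMs, here is a minimal sketch, not anything from OVS itself. It assumes a
Linux host, where the kernel exposes a PCI device's node in
/sys/bus/pci/devices/<address>/numa_node (-1 means no node is reported).
The function names, the sysfs_root parameter and the PCI address in the
usage comment are made up for illustration.

```python
from pathlib import Path


def pci_numa_node(pci_addr: str, sysfs_root: str = "/sys") -> int:
    """Return the NUMA node the kernel reports for a PCI device.

    A value of -1 means the kernel reports no NUMA node for the device.
    sysfs_root is a hypothetical parameter so the lookup can be pointed
    at a test directory instead of the real /sys.
    """
    path = Path(sysfs_root) / "bus" / "pci" / "devices" / pci_addr / "numa_node"
    return int(path.read_text().strip())


def same_numa(pci_addr: str, vm_node: int, sysfs_root: str = "/sys") -> bool:
    """True if the NIC reports a NUMA node and it matches the VM's node."""
    nic_node = pci_numa_node(pci_addr, sysfs_root)
    return nic_node >= 0 and nic_node == vm_node


# Example (hypothetical address and node): same_numa("0000:05:00.0", 1)
```

With that information, rx queues can be pinned by hand instead of relying
on the default assignment, for example through the Interface
other_config:pmd-rxq-affinity setting.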