Hi Don,

2022-07-26 14:33 (UTC-0400), Don Wallwork:
> This proposal describes a method for translating any huge page
> address from virtual to physical or vice versa using simple
> addition or subtraction of a single fixed value. This allows
> devices to efficiently access arbitrary huge page memory, even
> stack data when worker stacks are in huge pages.

What is the use case and how much is the benefit?

When drivers need to process a large number of memory blocks,
these are typically packets in the form of mbufs,
which already have IOVA attached, so there is no translation.
Does translation of mbuf VA to PA with the proposed method
show significant improvement over reading mbuf->iova?

When drivers need to process a few IOVA-contiguous memory blocks,
they can calculate VA-to-PA offsets in advance,
amortizing translation cost.
Hugepage stack falls within this category.

> When legacy memory mode is used, it is possible to map a single
> virtual memory region large enough to cover all huge pages. During
> legacy hugepage init, each hugepage is mapped into that region.

Legacy mode is called "legacy" with an intent to be deprecated :)

There is initial allocation (-m) and --socket-limit in dynamic mode.
When initial allocation is equal to the socket limit,
it should be the same behavior as in legacy mode:
the number of hugepages mapped is constant and cannot grow,
so the feature seems applicable as well.

> Once all pages have been mapped, any unused holes in that memory
> region are unmapped.

Who tracks these holes and prevents translation from their VA?
Why do the holes appear?

> This feature is applicable when rte_eal_iova_mode() == RTE_IOVA_PA

One can say it always works for RTE_IOVA_VA with VA-to-PA offset of 0.

> and could be enabled either by default when the legacy memory EAL
> option is given, or a new EAL option could be added to specifically
> enable this feature.
>
> It may be desirable to set a capability bit when this feature is
> enabled to allow drivers to behave differently depending on the
> state of that flag.

The feature requires, in IOVA-as-PA mode:
1) that hugepage mapping is static (legacy mode or "-m" == "--socket-limit");
2) that EAL has succeeded in mapping all hugepages in one PA-contiguous block.

As userspace code, DPDK cannot guarantee 2).
Because this mode breaks nothing and only makes translation more efficient,
DPDK can always attempt it and report through an API whether it has succeeded.
Applications and drivers can then query that API and decide what to do.
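
To make the "try, then report" idea concrete, here is a minimal sketch in plain C.
It is not DPDK code: struct hp_seg, hp_fixed_offset_probe(), hp_va2pa() and
hp_pa2va() are hypothetical names invented for illustration, and the sketch
assumes the caller has already collected the VA and PA of every mapped hugepage.
The probe succeeds only if a single offset is valid for all pages; otherwise
the caller falls back to the usual per-address lookup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor of one mapped hugepage (not a DPDK type). */
struct hp_seg {
	uintptr_t va; /* virtual address of the page start */
	uint64_t pa;  /* physical address of the page start */
};

/*
 * Probe whether one fixed offset translates every page.
 * Returns true and stores the offset if pa == va + offset holds
 * for all segments; returns false otherwise (or if there are none).
 * Unsigned arithmetic wraps, so the offset also works when pa < va.
 */
static bool
hp_fixed_offset_probe(const struct hp_seg *segs, size_t n, uint64_t *offset)
{
	if (n == 0)
		return false;

	uint64_t off = segs[0].pa - (uint64_t)segs[0].va;

	for (size_t i = 1; i < n; i++) {
		if (segs[i].pa - (uint64_t)segs[i].va != off)
			return false;
	}
	*offset = off;
	return true;
}

/* Translation is a single addition or subtraction once the probe succeeds. */
static inline uint64_t
hp_va2pa(uintptr_t va, uint64_t offset)
{
	return (uint64_t)va + offset;
}

static inline uintptr_t
hp_pa2va(uint64_t pa, uint64_t offset)
{
	return (uintptr_t)(pa - offset);
}
```

A driver would call the probe once at init and cache both the result and the
offset. Note that the sketch does no range checking, so it does not answer
the question above about VAs that fall into unmapped holes.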