On Thu, Feb 16, 2017 at 11:27 PM, maowenan <maowe...@huawei.com> wrote:
>
>
>> -----Original Message-----
>> From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org]
>> On Behalf Of Jeff Kirsher
>> Sent: Thursday, February 16, 2017 8:51 PM
>> To: da...@davemloft.net
>> Cc: Alexander Duyck; netdev@vger.kernel.org; nhor...@redhat.com;
>> sassm...@redhat.com; jogre...@redhat.com; Jeff Kirsher
>> Subject: [net-next 06/14] ixgbe: Update driver to make use of DMA attributes
>> in Rx path
>>
>> From: Alexander Duyck <alexander.h.du...@intel.com>
>>
>> This patch adds support for DMA_ATTR_SKIP_CPU_SYNC and
>> DMA_ATTR_WEAK_ORDERING. By enabling both of these for the Rx path we
>> are able to see performance improvements on architectures that implement
>> either one due to the fact that page mapping and unmapping only has to sync
>> what is actually being used instead of the entire buffer. In addition by
>> enabling the weak ordering attribute enables a performance improvement for
>> architectures that can associate a memory ordering with a DMA buffer such as
>> Sparc.
>>
>> Signed-off-by: Alexander Duyck <alexander.h.du...@intel.com>
>> Tested-by: Andrew Bowers <andrewx.bow...@intel.com>
>> Signed-off-by: Jeff Kirsher <jeffrey.t.kirs...@intel.com>
>> ---
>> drivers/net/ethernet/intel/ixgbe/ixgbe.h      |  3 ++
>> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 56 ++++++++++++++++++---------
>> 2 files changed, 40 insertions(+), 19 deletions(-)
<snip>
>
> Hi Alex,
> Is this patch available for arm64? I remember that it needs IOMMU support,
> right?

I assume you are talking about the DMA_ATTR_WEAK_ORDERING DMA attribute,
and no, it is not available for arm64 last I knew. It is related to
IOMMU-specific attributes available on some SPARC and PowerPC Cell
architectures.

The part of these patches that can provide additional throughput on arm is
that we now sync only the size of the buffer received instead of the
entire buffer, and that when we perform the sync for the device we sync
only the region that the device will write to. On architectures with
non-trivial sync operations, restricting the size to only what is needed
should result in a nice bump up in performance.

- Alex
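
To put a rough number on the sync-size point above, here is a small
userspace sketch. It is not driver code: it only models the sync cost as
the number of bytes touched, and the function names, the 2048-byte buffer
size and the frame lengths are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model, not ixgbe code: bytes synced when every received
 * frame triggers a sync of the whole Rx buffer.
 */
size_t bytes_synced_full(size_t buf_len, const size_t *frame_len, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++) {
        assert(frame_len[i] <= buf_len); /* a frame must fit in its buffer */
        total += buf_len;                /* whole buffer, whatever the frame size */
    }
    return total;
}

/*
 * Hypothetical model: bytes synced when only the received length of each
 * frame is synced.
 */
size_t bytes_synced_partial(size_t buf_len, const size_t *frame_len, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++) {
        assert(frame_len[i] <= buf_len);
        total += frame_len[i];           /* only the bytes the device wrote */
    }
    return total;
}
```

With assumed frames of 64, 128 and 1500 bytes in 2048-byte buffers, the
first function counts 6144 bytes and the second 1692. How much of that
difference shows up as throughput depends on how expensive the sync
operation is on a given architecture.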