With the patches applied to Zesty and the system booted with the iommu.passthrough=1 kernel parameter, we are able to see 35+ Gbps of bandwidth in iperf testing.
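Before comparing numbers, it can help to confirm that the parameter actually reached the running kernel. The following is a minimal sketch, not part of the original test procedure: kernel_param is a hypothetical helper, and the example command line is made up for illustration. If a parameter is repeated, the sketch returns the last occurrence, which is a choice of this sketch only.

```python
def kernel_param(cmdline, name):
    """Return the value of `name` in a kernel command line string.

    Returns None if the parameter is absent, "" if it is present
    without a value, and the last value if it appears more than once.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == name:
            value = val if sep else ""
    return value


# Hypothetical command line, for illustration only.
example = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sda2 ro iommu.passthrough=1"
print(kernel_param(example, "iommu.passthrough"))
```

On a live Linux system, the string to pass in would be the contents of /proc/cmdline.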
ubuntu@awsdp1:~$ iperf -s 192.168.5.5 -w 900M

ubuntu@awrep0:~$ iperf -c 192.168.5.5 -P 12 -w 900M
------------------------------------------------------------
Client connecting to 192.168.5.5, TCP port 5001
TCP window size: 1.76 GByte (WARNING: requested  900 MByte)
------------------------------------------------------------
[ 12] local 192.168.5.10 port 50216 connected with 192.168.5.5 port 5001
[  4] local 192.168.5.10 port 50194 connected with 192.168.5.5 port 5001
[  5] local 192.168.5.10 port 50196 connected with 192.168.5.5 port 5001
[  7] local 192.168.5.10 port 50200 connected with 192.168.5.5 port 5001
[  6] local 192.168.5.10 port 50198 connected with 192.168.5.5 port 5001
[  8] local 192.168.5.10 port 50202 connected with 192.168.5.5 port 5001
[  9] local 192.168.5.10 port 50204 connected with 192.168.5.5 port 5001
[  3] local 192.168.5.10 port 50206 connected with 192.168.5.5 port 5001
[ 10] local 192.168.5.10 port 50208 connected with 192.168.5.5 port 5001
[ 11] local 192.168.5.10 port 50210 connected with 192.168.5.5 port 5001
[ 13] local 192.168.5.10 port 50212 connected with 192.168.5.5 port 5001
[ 14] local 192.168.5.10 port 50214 connected with 192.168.5.5 port 5001
[ ID] Interval       Transfer     Bandwidth
[ 12]  0.0-10.0 sec  4.16 GBytes  3.58 Gbits/sec
[  4]  0.0-10.0 sec  2.72 GBytes  2.34 Gbits/sec
[  5]  0.0-10.0 sec  3.55 GBytes  3.05 Gbits/sec
[  7]  0.0-10.0 sec  3.97 GBytes  3.41 Gbits/sec
[  6]  0.0-10.0 sec  4.09 GBytes  3.52 Gbits/sec
[  8]  0.0-10.0 sec  3.61 GBytes  3.10 Gbits/sec
[  9]  0.0-10.0 sec  3.48 GBytes  2.99 Gbits/sec
[  3]  0.0-10.0 sec  3.43 GBytes  2.95 Gbits/sec
[ 10]  0.0-10.0 sec  2.92 GBytes  2.51 Gbits/sec
[ 11]  0.0-10.0 sec  4.06 GBytes  3.49 Gbits/sec
[ 14]  0.0-10.0 sec  4.08 GBytes  3.50 Gbits/sec
[ 13]  0.0-10.0 sec  2.71 GBytes  2.32 Gbits/sec
[SUM]  0.0-10.0 sec  42.8 GBytes  36.7 Gbits/sec

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1705123

Title:
  [Zesty] Fixes to iommu on arm64 to improve IO performance

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  [Impact]
  With SMMU page translation mode enabled, several inefficiencies in the
  ARM SMMUv3 driver have been identified. As a result, IO bandwidth is
  severely impacted.

  [Test]
  Running iperf on Mellanox ConnectX-4 cards capable of 40 Gbps,
  connected back to back, we get only about 15 Gbps.

  iperf -c 192.168.5.5 --bind 192.168.5.10 -P 5 -w 3.1M
  ------------------------------------------------------------
  Client connecting to 192.168.5.5, TCP port 5001
  Binding to local address 192.168.5.10
  TCP window size: 6.20 MByte (WARNING: requested 3.10 MByte)
  ------------------------------------------------------------
  [  7] local 192.168.5.10 port 42588 connected with 192.168.5.5 port 5001
  [  3] local 192.168.5.10 port 42582 connected with 192.168.5.5 port 5001
  [  5] local 192.168.5.10 port 42584 connected with 192.168.5.5 port 5001
  [  4] local 192.168.5.10 port 42586 connected with 192.168.5.5 port 5001
  [  6] local 192.168.5.10 port 42590 connected with 192.168.5.5 port 5001
  [ ID] Interval       Transfer     Bandwidth
  [  3]  0.0-10.0 sec  6.50 GBytes  5.58 Gbits/sec
  [  4]  0.0-10.0 sec  1.81 GBytes  1.56 Gbits/sec
  [  6]  0.0-10.0 sec  6.08 GBytes  5.22 Gbits/sec
  [  7]  0.0-10.0 sec  1.99 GBytes  1.71 Gbits/sec
  [  5]  0.0-10.0 sec  2.00 GBytes  1.72 Gbits/sec
  [SUM]  0.0-10.0 sec  18.4 GBytes  15.8 Gbits/sec

  [Fix]
  After applying the patches listed below from linux-next, we were able
  to get throughput of 35+ Gbps. These patches are currently in
  linux-next and in line for 4.13-rc.

  iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table
  iommu/arm-smmu-v3: Remove io-pgtable spinlock
  iommu/arm-smmu: Remove io-pgtable spinlock
  iommu/io-pgtable-arm-v7s: Support lockless operation
  iommu/io-pgtable-arm: Support lockless operation
  iommu/io-pgtable: Introduce explicit coherency
  iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap
  iommu/io-pgtable-arm: Improve split_blk_unmap
  iommu/io-pgtable-arm-v7s: Check table PTEs more precisely
  iommu/io-pgtable-arm-v7s: constify dummy_tlb_ops.
  iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
  iommu/io-pgtable-arm-v7s: Add support for the IOMMU_PRIV flag
  iommu/io-pgtable-arm: Avoid shift overflow in block size
  iommu/io-pgtable-arm: Check for leaf entry before dereferencing it
  iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag
  iommu: add IOMMU_PRIV attribute

  [Regression Potential]
  These patches cherry-pick cleanly on top of Zesty (4.10), and they are
  limited to ARM SMMU and page table fixes. The patches were tested on a
  QDF2400 system with Mellanox ConnectX-4 40G cards.
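
The two iperf reports quoted in this message can be cross-checked by summing the per-stream rates and comparing the reported totals. The following is a rough sketch under stated assumptions: stream_rates and the ROW pattern are hypothetical helpers, not part of iperf, and the pattern only handles summary rows reported in Gbits/sec. The figures are copied from the reports above.

```python
import re

# Matches iperf summary rows such as
# "[ 12] 0.0-10.0 sec 4.16 GBytes 3.58 Gbits/sec" (Gbits/sec rows only).
ROW = re.compile(r"\[\s*(\d+|SUM)\]\s+\S+\s+sec\s+\S+\s+\S+\s+([\d.]+)\s+Gbits/sec")


def stream_rates(report):
    """Return (list of per-stream Gbit/s rates, reported SUM rate or None)."""
    rates, total = [], None
    for match in ROW.finditer(report):
        stream_id, rate = match.group(1), float(match.group(2))
        if stream_id == "SUM":
            total = rate
        else:
            rates.append(rate)
    return rates, total


BEFORE = """
[  3]  0.0-10.0 sec  6.50 GBytes  5.58 Gbits/sec
[  4]  0.0-10.0 sec  1.81 GBytes  1.56 Gbits/sec
[  6]  0.0-10.0 sec  6.08 GBytes  5.22 Gbits/sec
[  7]  0.0-10.0 sec  1.99 GBytes  1.71 Gbits/sec
[  5]  0.0-10.0 sec  2.00 GBytes  1.72 Gbits/sec
[SUM]  0.0-10.0 sec  18.4 GBytes  15.8 Gbits/sec
"""

AFTER = """
[ 12]  0.0-10.0 sec  4.16 GBytes  3.58 Gbits/sec
[  4]  0.0-10.0 sec  2.72 GBytes  2.34 Gbits/sec
[  5]  0.0-10.0 sec  3.55 GBytes  3.05 Gbits/sec
[  7]  0.0-10.0 sec  3.97 GBytes  3.41 Gbits/sec
[  6]  0.0-10.0 sec  4.09 GBytes  3.52 Gbits/sec
[  8]  0.0-10.0 sec  3.61 GBytes  3.10 Gbits/sec
[  9]  0.0-10.0 sec  3.48 GBytes  2.99 Gbits/sec
[  3]  0.0-10.0 sec  3.43 GBytes  2.95 Gbits/sec
[ 10]  0.0-10.0 sec  2.92 GBytes  2.51 Gbits/sec
[ 11]  0.0-10.0 sec  4.06 GBytes  3.49 Gbits/sec
[ 14]  0.0-10.0 sec  4.08 GBytes  3.50 Gbits/sec
[ 13]  0.0-10.0 sec  2.71 GBytes  2.32 Gbits/sec
[SUM]  0.0-10.0 sec  42.8 GBytes  36.7 Gbits/sec
"""

rates_before, sum_before = stream_rates(BEFORE)
rates_after, sum_after = stream_rates(AFTER)

# Per-stream values are rounded by iperf, so the recomputed totals
# may differ slightly from the reported [SUM] rows.
print(f"before: {sum(rates_before):.2f} Gbit/s over {len(rates_before)} streams")
print(f"after:  {sum(rates_after):.2f} Gbit/s over {len(rates_after)} streams")
print(f"ratio of reported totals: {sum_after / sum_before:.2f}x")
```

Note that the two runs differ in stream count (-P 5 versus -P 12) and window size (-w 3.1M versus -w 900M), so the ratio reflects the combined effect of the test setup and the kernel changes.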