With the patches applied to Zesty and the iommu.passthrough=1 kernel
parameter set at boot, we see 35+ Gbps of bandwidth in iperf testing.
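
To confirm that a machine actually booted with the parameter, the running
kernel's command line can be inspected. The following is a minimal sketch;
the helper names iommu_passthrough and read_cmdline are made up for
illustration, and it assumes a Linux system that exposes /proc/cmdline.

```python
def iommu_passthrough(cmdline):
    """Return the iommu.passthrough value from a kernel command line.

    Returns None when the parameter is absent. If it appears more than
    once, the last occurrence seen is returned.
    """
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "iommu.passthrough" and sep:
            value = val
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as f:
        return f.read()
```

On the test system, iommu_passthrough(read_cmdline()) should return "1" if
the parameter took effect.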

ubuntu@awsdp1:~$ iperf -s 192.168.5.5 -w 900M

ubuntu@awrep0:~$ iperf -c 192.168.5.5 -P 12 -w 900M
------------------------------------------------------------
Client connecting to 192.168.5.5, TCP port 5001
TCP window size: 1.76 GByte (WARNING: requested 900 MByte)
------------------------------------------------------------
[ 12] local 192.168.5.10 port 50216 connected with 192.168.5.5 port 5001
[ 4] local 192.168.5.10 port 50194 connected with 192.168.5.5 port 5001
[ 5] local 192.168.5.10 port 50196 connected with 192.168.5.5 port 5001
[ 7] local 192.168.5.10 port 50200 connected with 192.168.5.5 port 5001
[ 6] local 192.168.5.10 port 50198 connected with 192.168.5.5 port 5001
[ 8] local 192.168.5.10 port 50202 connected with 192.168.5.5 port 5001
[ 9] local 192.168.5.10 port 50204 connected with 192.168.5.5 port 5001
[ 3] local 192.168.5.10 port 50206 connected with 192.168.5.5 port 5001
[ 10] local 192.168.5.10 port 50208 connected with 192.168.5.5 port 5001
[ 11] local 192.168.5.10 port 50210 connected with 192.168.5.5 port 5001
[ 13] local 192.168.5.10 port 50212 connected with 192.168.5.5 port 5001
[ 14] local 192.168.5.10 port 50214 connected with 192.168.5.5 port 5001
[ ID] Interval Transfer Bandwidth
[ 12] 0.0-10.0 sec 4.16 GBytes 3.58 Gbits/sec
[ 4] 0.0-10.0 sec 2.72 GBytes 2.34 Gbits/sec
[ 5] 0.0-10.0 sec 3.55 GBytes 3.05 Gbits/sec
[ 7] 0.0-10.0 sec 3.97 GBytes 3.41 Gbits/sec
[ 6] 0.0-10.0 sec 4.09 GBytes 3.52 Gbits/sec
[ 8] 0.0-10.0 sec 3.61 GBytes 3.10 Gbits/sec
[ 9] 0.0-10.0 sec 3.48 GBytes 2.99 Gbits/sec
[ 3] 0.0-10.0 sec 3.43 GBytes 2.95 Gbits/sec
[ 10] 0.0-10.0 sec 2.92 GBytes 2.51 Gbits/sec
[ 11] 0.0-10.0 sec 4.06 GBytes 3.49 Gbits/sec
[ 14] 0.0-10.0 sec 4.08 GBytes 3.50 Gbits/sec
[ 13] 0.0-10.0 sec 2.71 GBytes 2.32 Gbits/sec
[SUM] 0.0-10.0 sec 42.8 GBytes 36.7 Gbits/sec
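
As a sanity check on the figures above, the per-stream rates can be added up
and compared with the [SUM] line. This is only a sketch: the regular
expression is written for the exact line format shown in this mail and is
not a general iperf parser.

```python
import re

# Result lines copied from the iperf client output above.
REPORT = """\
[ 12] 0.0-10.0 sec 4.16 GBytes 3.58 Gbits/sec
[ 4] 0.0-10.0 sec 2.72 GBytes 2.34 Gbits/sec
[ 5] 0.0-10.0 sec 3.55 GBytes 3.05 Gbits/sec
[ 7] 0.0-10.0 sec 3.97 GBytes 3.41 Gbits/sec
[ 6] 0.0-10.0 sec 4.09 GBytes 3.52 Gbits/sec
[ 8] 0.0-10.0 sec 3.61 GBytes 3.10 Gbits/sec
[ 9] 0.0-10.0 sec 3.48 GBytes 2.99 Gbits/sec
[ 3] 0.0-10.0 sec 3.43 GBytes 2.95 Gbits/sec
[ 10] 0.0-10.0 sec 2.92 GBytes 2.51 Gbits/sec
[ 11] 0.0-10.0 sec 4.06 GBytes 3.49 Gbits/sec
[ 14] 0.0-10.0 sec 4.08 GBytes 3.50 Gbits/sec
[ 13] 0.0-10.0 sec 2.71 GBytes 2.32 Gbits/sec
[SUM] 0.0-10.0 sec 42.8 GBytes 36.7 Gbits/sec
"""

LINE = re.compile(
    r"^\[\s*(\d+|SUM)\]\s+\S+ sec\s+[\d.]+ GBytes\s+([\d.]+) Gbits/sec$"
)


def stream_rates(report):
    """Return ({stream_id: Gbits/sec}, reported SUM or None)."""
    streams, total = {}, None
    for line in report.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue
        if m.group(1) == "SUM":
            total = float(m.group(2))
        else:
            streams[int(m.group(1))] = float(m.group(2))
    return streams, total


streams, total = stream_rates(REPORT)
print("streams:", len(streams))
print("sum of streams: %.2f Gbits/sec" % sum(streams.values()))
print("reported SUM:   %.1f Gbits/sec" % total)
```

The twelve streams add up to about 36.76 Gbits/sec, which agrees with the
reported 36.7 Gbits/sec to within the rounding of the individual lines.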

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1705123

Title:
  [Zesty] Fixes to iommu on arm64 to improve IO performance

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  [Impact]
  With SMMU page translation mode enabled, several inefficiencies in the ARM 
SMMUv3 driver have been identified. As a result, IO bandwidth is severely 
impacted.

  [Test]
  Running iperf between two Mellanox ConnectX-4 cards (capable of 40 Gbps) 
connected back to back, we get only about 15 Gbps.

  iperf -c 192.168.5.5 --bind 192.168.5.10 -P 5 -w 3.1M
  ------------------------------------------------------------
  Client connecting to 192.168.5.5, TCP port 5001
  Binding to local address 192.168.5.10
  TCP window size: 6.20 MByte (WARNING: requested 3.10 MByte)
  ------------------------------------------------------------
  [ 7] local 192.168.5.10 port 42588 connected with 192.168.5.5 port 5001
  [ 3] local 192.168.5.10 port 42582 connected with 192.168.5.5 port 5001
  [ 5] local 192.168.5.10 port 42584 connected with 192.168.5.5 port 5001
  [ 4] local 192.168.5.10 port 42586 connected with 192.168.5.5 port 5001
  [ 6] local 192.168.5.10 port 42590 connected with 192.168.5.5 port 5001
  [ ID] Interval Transfer Bandwidth
  [ 3] 0.0-10.0 sec 6.50 GBytes 5.58 Gbits/sec
  [ 4] 0.0-10.0 sec 1.81 GBytes 1.56 Gbits/sec
  [ 6] 0.0-10.0 sec 6.08 GBytes 5.22 Gbits/sec
  [ 7] 0.0-10.0 sec 1.99 GBytes 1.71 Gbits/sec
  [ 5] 0.0-10.0 sec 2.00 GBytes 1.72 Gbits/sec
  [SUM] 0.0-10.0 sec 18.4 GBytes 15.8 Gbits/sec
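
For scale, the baseline [SUM] above can be set against the 40 Gbps link rate
and against the 36.7 Gbits/sec measured on the patched kernel. The arithmetic
is sketched below. Note that the two runs used different iperf options
(-P 5 -w 3.1M here, -P 12 -w 900M for the patched run), so the ratio is
indicative rather than a controlled comparison.

```python
LINK_GBPS = 40.0      # nominal link rate stated in the report
BASELINE_GBPS = 15.8  # [SUM] of the unpatched run above
PATCHED_GBPS = 36.7   # [SUM] reported for the patched Zesty kernel


def utilization(rate_gbps, link_gbps=LINK_GBPS):
    """Fraction of the nominal link rate that a run achieved."""
    return rate_gbps / link_gbps


speedup = PATCHED_GBPS / BASELINE_GBPS

print("baseline utilization: %.3f" % utilization(BASELINE_GBPS))
print("patched utilization:  %.3f" % utilization(PATCHED_GBPS))
print("speedup:              %.2fx" % speedup)
```

The baseline run uses roughly 40% of the link and the patched run roughly
92%, a speedup of about 2.3x.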

  [Fix]
  After applying the patches listed below, we were able to get throughput of 
35+ Gbps. These patches are currently in linux-next and in line for 4.13-rc.

  iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table
  iommu/arm-smmu-v3: Remove io-pgtable spinlock
  iommu/arm-smmu: Remove io-pgtable spinlock
  iommu/io-pgtable-arm-v7s: Support lockless operation
  iommu/io-pgtable-arm: Support lockless operation
  iommu/io-pgtable: Introduce explicit coherency
  iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap
  iommu/io-pgtable-arm: Improve split_blk_unmap
  iommu/io-pgtable-arm-v7s: Check table PTEs more precisely
  iommu/io-pgtable-arm-v7s: constify dummy_tlb_ops.
  iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it
  iommu/io-pgtable-arm-v7s: Add support for the IOMMU_PRIV flag
  iommu/io-pgtable-arm: Avoid shift overflow in block size
  iommu/io-pgtable-arm: Check for leaf entry before dereferencing it
  iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag
  iommu: add IOMMU_PRIV attribute
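
When backporting, the patches need to be located in linux-next and applied
oldest first. The sketch below assumes the list above is in newest-first
order, as `git log` would print it (the IOMMU_PRIV attribute patch at the
bottom is a prerequisite for the two patches above it). The ref name
linux-next/master and the helper lookup_commands are assumptions for
illustration. The script only prints lookup commands; it does not run git.

```python
import shlex

# Subjects exactly as listed in the report (assumed newest first).
SUBJECTS = [
    "iommu/io-pgtable-arm: Use dma_wmb() instead of wmb() when publishing table",
    "iommu/arm-smmu-v3: Remove io-pgtable spinlock",
    "iommu/arm-smmu: Remove io-pgtable spinlock",
    "iommu/io-pgtable-arm-v7s: Support lockless operation",
    "iommu/io-pgtable-arm: Support lockless operation",
    "iommu/io-pgtable: Introduce explicit coherency",
    "iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap",
    "iommu/io-pgtable-arm: Improve split_blk_unmap",
    "iommu/io-pgtable-arm-v7s: Check table PTEs more precisely",
    "iommu/io-pgtable-arm-v7s: constify dummy_tlb_ops.",
    "iommu/io-pgtable-arm-v7s: Check for leaf entry before dereferencing it",
    "iommu/io-pgtable-arm-v7s: Add support for the IOMMU_PRIV flag",
    "iommu/io-pgtable-arm: Avoid shift overflow in block size",
    "iommu/io-pgtable-arm: Check for leaf entry before dereferencing it",
    "iommu/io-pgtable-arm: add support for the IOMMU_PRIV flag",
    "iommu: add IOMMU_PRIV attribute",
]


def lookup_commands(subjects, ref="linux-next/master"):
    """Yield one `git log` lookup per subject, oldest patch first.

    -F makes --grep match the subject as a fixed string, so characters
    such as "()" and "." are not treated as regex syntax.
    """
    for subject in reversed(subjects):
        grep = shlex.quote("--grep=" + subject)
        yield "git log --oneline -F %s %s" % (grep, shlex.quote(ref))


for cmd in lookup_commands(SUBJECTS):
    print(cmd)
```

Each printed command should resolve one subject to a commit, which can then
be applied with git cherry-pick in the printed order.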

  [Regression Potential]
  These patches cherry-pick cleanly on top of Zesty (4.10) and are limited to 
ARM SMMU and page-table fixes. They were tested on a QDF2400 system with 
Mellanox ConnectX-4 40G cards.
