On 7/4/2025 9:18 AM, Michal Kubiak wrote:
> This series modernizes the Rx path in the ice driver by removing legacy
> code and switching to the Page Pool API. The changes follow the same
> direction as previously done for the iavf driver, and aim to simplify
> buffer management, improve maintainability, and prepare for future
> infrastructure reuse.
>
> An important motivation for this work was addressing reports of poor
> performance in XDP_TX mode when IOMMU is enabled. The legacy Rx model
> incurred significant overhead due to per-frame DMA mapping, which
> limited throughput in virtualized environments. This series eliminates
> those bottlenecks by adopting Page Pool and bi-directional DMA mapping.
>
> The first patch removes the legacy Rx path, which relied on manual skb
> allocation and header copying. This path has become obsolete due to the
> availability of build_skb() and the increasing complexity of supporting
> features like XDP and multi-buffer.
>
> The second patch drops the page splitting and recycling logic. While
> once used to optimize memory usage, this logic introduced significant
> complexity and hotpath overhead. Removing it simplifies the Rx flow and
> sets the stage for Page Pool adoption.
>
> The final patch switches the driver to use the Page Pool and libeth
> APIs. It also updates the XDP implementation to use libeth_xdp helpers
> and optimizes XDP_TX by avoiding per-frame DMA mapping. This results in
> a significant performance improvement in virtualized environments with
> IOMMU enabled (over 5x gain in XDP_TX throughput). In other scenarios,
> performance remains on par with the previous implementation.
>
> This conversion also aligns with the broader effort to modularize and
> unify XDP support across Intel Ethernet drivers.
>
> Tested on various workloads including netperf and XDP modes (PASS, DROP,
> TX) with and without IOMMU. No regressions observed.
>
> Last but not least, it is suspected that this series may also help
> mitigate the memory consumption issues recently reported in the driver.
> For further details, see:
>
> https://lore.kernel.org/intel-wired-lan/cak8ffz4hy6gujnenz3wy9jaylzxgfpr7dnzxzgmyoe44car...@mail.gmail.com/
>

I tried to apply these and test them, but I ran into several issues :(

The iperf3 session starts with some traffic and then very quickly dies to
zero:

> [ 5] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec
> [ 8] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec
> [ 10] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec
> [ 12] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec
> [ 14] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec
> [SUM] 4.00-5.00 sec 0.00 Bytes 0.00 bits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ 5] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec
> [ 8] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec
> [ 10] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec
> [ 12] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec
> [ 14] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec
> [SUM] 5.00-6.00 sec 0.00 Bytes 0.00 bits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ 5] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec
> [ 8] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec
> [ 10] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec
> [ 12] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec
> [ 14] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec
> [SUM] 6.00-7.00 sec 0.00 Bytes 0.00 bits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ 5] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec
> [ 8] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec
> [ 10] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec
> [ 12] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec
> [ 14] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec
> [SUM] 7.00-8.00 sec 0.00 Bytes 0.00 bits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ 5] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec
> [ 8] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec
> [ 10] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec
> [ 12] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec
> [ 14] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec
> [SUM] 8.00-9.00 sec 0.00 Bytes 0.00 bits/sec
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ 5] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec
> [ 8] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec
> [ 10] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec
> [ 12] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec
> [ 14] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec
> [SUM] 9.00-10.00 sec 0.00 Bytes 0.00 bits/sec

I eventually got a crash:

> jekeller-stp-glorfindel login: [ 326.338776] ------------[ cut here ]------------
> [ 326.343440] WARNING: CPU: 109 PID: 0 at include/net/page_pool/helpers.h:297 libeth_rx_recycle_slow+0x2f/0x4f [libeth]
> [ 326.354082] Modules linked in: ice gnss libeth_xdp libeth cfg80211 rfkill nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security nf_tables ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables qrtr intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common sunrpc skx_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel spi_nor mtd kvm irqbypass iTCO_wdt rapl intel_pmc_bxt ipmi_ssif mei_me iTCO_vendor_support intel_cstate vfat fat i40e spi_intel_pci intel_uncore i2c_i801 pcspkr libie ioatdma mei libie_adminq lpc_ich i2c_smbus spi_intel intel_pch_thermal dca ipmi_si acpi_power_meter acpi_ipmi ipmi_devintf ipmi_msghandler acpi_pad fuse loop dm_multipath nfnetlink zram
> [ 326.354222] lz4hc_compress lz4_compress xfs qat_c62x intel_qat polyval_clmulni ghash_clmulni_intel sha512_ssse3 sha1_ssse3 ast crc8 i2c_algo_bit wmi scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkcs8_key_parser tls
> [ 326.462156] CPU: 109 UID: 0 PID: 0 Comm: swapper/109 Not tainted 6.16.0-rc4-ice-page-pool+ #25 PREEMPT(lazy)
> [ 326.472075] Hardware name: Intel Corporation S2600STQ/S2600STQ, BIOS SE5C620.86B.02.01.0017.110620230543 11/06/2023
> [ 326.482519] RIP: 0010:libeth_rx_recycle_slow+0x2f/0x4f [libeth]
> [ 326.488454] Code: 1f 44 00 00 48 89 f8 48 89 fe 48 83 e0 fe 48 8b 50 28 48 8b 78 10 48 ff ca 74 20 48 83 ca ff f0 48 0f c1 50 28 48 ff ca 79 07 <0f> 0b c3 cc cc cc cc 75 12 48 c7 40 28 01 00 00 00 31 c9 83 ca ff
> [ 326.507232] RSP: 0018:ffffd2c4c814cd38 EFLAGS: 00010296
> [ 326.512466] RAX: fffff58c342d0ec0 RBX: 0000000000000000 RCX: 00000000000000e3
> [ 326.519608] RDX: ffffffffffffffff RSI: fffff58c342d0ec0 RDI: ffff8d596e024100
> [ 326.527173] RBP: ffffd2c4c814cdf8 R08: ffffd2c4e6bd3960 R09: 0000000000000000
> [ 326.534674] R10: 00000000fffffb54 R11: 000000000002cd86 R12: ffff8d49fde71cb0
> [ 326.542159] R13: 00000000000001cb R14: ffff8d49acca5600 R15: ffffd2c4e6bd3960
> [ 326.549627] FS: 0000000000000000(0000) GS:ffff8d59a3c9b000(0000) knlGS:0000000000000000
> [ 326.558047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 326.564119] CR2: 00007f3eda90df78 CR3: 0000000caee56001 CR4: 00000000007726f0
> [ 326.571574] PKRU: 55555554
> [ 326.574595] Call Trace:
> [ 326.577353] <IRQ>
> [ 326.579664] ice_clean_rx_irq+0x431/0x520 [ice]
> [ 326.584584] ? iommu_dma_unmap_page+0x48/0x90
> [ 326.589232] ice_napi_poll+0xbe/0x2a0 [ice]
> [ 326.593786] __napi_poll+0x2e/0x1e0
> [ 326.597567] net_rx_action+0x336/0x420
> [ 326.601608] ? update_rq_clock_task+0x3f/0x1d0
> [ 326.606344] ? sched_clock+0x10/0x30
> [ 326.610207] handle_softirqs+0xed/0x340
> [ 326.614316] __irq_exit_rcu+0xcb/0xf0
> [ 326.618241] common_interrupt+0x85/0xa0
> [ 326.622340] </IRQ>
> [ 326.624702] <TASK>
> [ 326.627053] asm_common_interrupt+0x26/0x40
> [ 326.631493] RIP: 0010:cpuidle_enter_state+0xcc/0x660
> [ 326.636709] Code: 00 00 e8 67 40 ed fe e8 32 f0 ff ff 49 89 c4 0f 1f 44 00 00 31 ff e8 53 54 eb fe 45 84 ff 0f 85 02 02 00 00 fb 0f 1f 44 00 00 <85> ed 0f 88 d3 01 00 00 4c 63 f5 49 83 fe 0a 0f 83 9f 04 00 00 49
> [ 326.655959] RSP: 0018:ffffd2c4c6aefe50 EFLAGS: 00000246
> [ 326.661446] RAX: ffff8d59a3c9b000 RBX: ffff8d592decfe80 RCX: 0000000000000000
> [ 326.668863] RDX: 0000004bfb4d51d2 RSI: 000000003351fed6 RDI: 0000000000000000
> [ 326.676284] RBP: 0000000000000002 R08: ffffffbe2deca6d0 R09: ffff8d592deb0660
> [ 326.683706] R10: 0000008df1fafa1d R11: 0000000000000000 R12: 0000004bfb4d51d2
> [ 326.691133] R13: ffffffff89512ee0 R14: 0000000000000002 R15: 0000000000000000
> [ 326.698560] cpuidle_enter+0x31/0x50
> [ 326.702387] cpuidle_idle_call+0xf5/0x160
> [ 326.706647] do_idle+0x78/0xd0
> [ 326.709937] cpu_startup_entry+0x29/0x30
> [ 326.714087] start_secondary+0x126/0x170
> [ 326.718241] common_startup_64+0x13e/0x141
> [ 326.722561] </TASK>
> [ 326.724960] ---[ end trace 0000000000000000 ]---

Something has gone wrong with the patches applied :(

> Thanks,
> Michal
>
> Michal Kubiak (3):
>   ice: remove legacy Rx and construct SKB
>   ice: drop page splitting and recycling
>   ice: switch to Page Pool
>
>  drivers/net/ethernet/intel/Kconfig | 1 +
>  drivers/net/ethernet/intel/ice/ice.h | 3 +-
>  drivers/net/ethernet/intel/ice/ice_base.c | 122 ++--
>  drivers/net/ethernet/intel/ice/ice_ethtool.c | 22 +-
>  drivers/net/ethernet/intel/ice/ice_lib.c | 1 -
>  drivers/net/ethernet/intel/ice/ice_main.c | 21 +-
>  drivers/net/ethernet/intel/ice/ice_txrx.c | 645 +++---------------
>  drivers/net/ethernet/intel/ice/ice_txrx.h | 37 +-
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 65 +-
>  drivers/net/ethernet/intel/ice/ice_txrx_lib.h | 7 +-
>  drivers/net/ethernet/intel/ice/ice_virtchnl.c | 5 +-
>  drivers/net/ethernet/intel/ice/ice_xsk.c | 120 +---
>  drivers/net/ethernet/intel/ice/ice_xsk.h | 6 +-
>  13 files changed, 205 insertions(+), 850 deletions(-)
>
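
A note on reading the trace: the WARNING at include/net/page_pool/helpers.h:297 looks like the page pool check that a page's reference count never drops below zero. That is an inference from the register dump (RDX holds -1 at the trap, right after an atomic decrement in the Code bytes), not something confirmed in this thread. Under that assumption, the condition can be sketched with a small user-space model. Everything below (struct model_page, model_page_get, model_page_unref) is hypothetical and heavily simplified; it is not the kernel's or libeth's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for a pooled Rx page. NOT the kernel's struct page. */
struct model_page {
	long ref_count;	/* references currently handed out */
	bool warned;	/* set once the count has gone negative */
};

/* Hand out nr references, e.g. when a buffer is posted to the Rx ring. */
void model_page_get(struct model_page *p, long nr)
{
	assert(nr > 0);
	p->ref_count += nr;
}

/*
 * Drop nr references and return what is left.
 * A negative result means more references were released than were ever
 * handed out; the model flags it instead of trapping.
 */
long model_page_unref(struct model_page *p, long nr)
{
	long ret;

	assert(nr > 0);
	ret = p->ref_count - nr;
	p->ref_count = ret;
	if (ret < 0) {
		p->warned = true;
		fprintf(stderr, "model: ref_count underflow (%ld)\n", ret);
	}
	return ret;
}
```

In this model, taking two references and dropping two leaves the count at zero with no warning; one further drop returns -1 and sets the flag. Whether the driver hits the real check through a double recycle, a buffer that never came from the pool, or some other path cannot be told from the log alone.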
