Hello,
In this example I've got a 4 vCPU Azure VM with 16 GB of RAM, 2 GB of which is
given to 1024 2 MB huge pages:
$ cat /proc/meminfo | grep -i huge
AnonHugePages: 71680 kB
ShmemHugePages: 0 kB
FileHugePages: 0 kB
HugePages_Total: 1024
HugePages_Free: 1
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 2097152 kB
$
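As a sanity check, the numbers above can be cross-checked with a small script (a sketch, assuming the standard /proc/meminfo field names; the sample text below is the output from this VM):

```python
# Sketch: parse the hugepage counters from /proc/meminfo-style text and
# verify that HugePages_Total * Hugepagesize matches the Hugetlb total.

def parse_hugepages(meminfo_text):
    """Return a dict of hugepage-related fields (values in kB or pages)."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if "huge" in key.lower():
            fields[key.strip()] = int(rest.split()[0])
    return fields

sample = """\
HugePages_Total:    1024
HugePages_Free:        1
Hugepagesize:       2048 kB
Hugetlb:         2097152 kB
"""

hp = parse_hugepages(sample)
total_kb = hp["HugePages_Total"] * hp["Hugepagesize"]
print(total_kb)                   # 2097152, i.e. 1024 pages * 2048 kB = 2 GB
print(total_kb == hp["Hugetlb"])  # True
```

Note HugePages_Free is 1, i.e. VPP has consumed essentially the whole pool.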
There are two interfaces, both VPP-owned and both using the netvsc PMD:
$ sudo vppctl sh hard
              Name                Idx   Link  Hardware
GigabitEthernet1                   1     up   GigabitEthernet1
  Link speed: 50 Gbps
  RX Queues:
    queue thread         mode
    0     vpp_wk_0 (1)   polling
    1     vpp_wk_1 (2)   polling
  Ethernet address 60:45:bd:85:22:97
  Microsoft Hyper-V Netvsc
    carrier up full duplex max-frame-size 0
    flags: tx-offload rx-ip4-cksum
    Devargs:
    rx: queues 2 (max 64), desc 1024 (min 0 max 65535 align 1)
    tx: queues 2 (max 64), desc 1024 (min 1 max 4096 align 1)
    max rx packet len: 65536
    promiscuous: unicast off all-multicast off
    vlan offload: strip off filter off qinq off
    rx offload avail:  vlan-strip ipv4-cksum udp-cksum tcp-cksum rss-hash
    rx offload active: ipv4-cksum
    tx offload avail:  vlan-insert ipv4-cksum udp-cksum tcp-cksum tcp-tso
                       multi-segs
    tx offload active: ipv4-cksum udp-cksum tcp-cksum multi-segs
    rss avail:         ipv4-tcp ipv4-udp ipv4 ipv6-tcp ipv6
    rss active:        ipv4-tcp ipv4 ipv6-tcp ipv6
    tx burst function: (not available)
    rx burst function: (not available)
GigabitEthernet2                   2     up   GigabitEthernet2
  Link speed: 50 Gbps
  RX Queues:
    queue thread         mode
    0     vpp_wk_2 (3)   polling
    1     vpp_wk_0 (1)   polling
  Ethernet address 60:45:bd:85:23:94
  Microsoft Hyper-V Netvsc
    carrier up full duplex max-frame-size 0
    flags: tx-offload rx-ip4-cksum
    Devargs:
    rx: queues 2 (max 64), desc 1024 (min 0 max 65535 align 1)
    tx: queues 2 (max 64), desc 1024 (min 1 max 4096 align 1)
    max rx packet len: 65536
    promiscuous: unicast off all-multicast off
    vlan offload: strip off filter off qinq off
    rx offload avail:  vlan-strip ipv4-cksum udp-cksum tcp-cksum rss-hash
    rx offload active: ipv4-cksum
    tx offload avail:  vlan-insert ipv4-cksum udp-cksum tcp-cksum tcp-tso
                       multi-segs
    tx offload active: ipv4-cksum udp-cksum tcp-cksum multi-segs
    rss avail:         ipv4-tcp ipv4-udp ipv4 ipv6-tcp ipv6
    rss active:        ipv4-tcp ipv4 ipv6-tcp ipv6
    tx burst function: (not available)
    rx burst function: (not available)
local0                             0    down  local0
  Link speed: unknown
  local
$
The config file looks like this:
unix {
  nodaemon
  log /var/log/vpp/vpp.log
  full-coredump
  cli-listen /run/vpp/cli.sock
  gid vpp
}
api-trace {
  on
}
api-segment {
  gid vpp
}
socksvr {
  socket-name /run/vpp/api.sock
}
plugins {
  # Common plugins.
  plugin default { disable }
  plugin dpdk_plugin.so { enable }
  plugin linux_cp_plugin.so { enable }
  plugin crypto_native_plugin.so { enable }
  < -- snip lots of plugins -- >
}
dpdk {
  # VMBUS UUID.
  dev 6045bd85-2297-6045-bd85-22976045bd85 {
    num-rx-queues 4
    num-tx-queues 4
    name GigabitEthernet1
  }
  # VMBUS UUID.
  dev 6045bd85-2394-6045-bd85-23946045bd85 {
    num-rx-queues 4
    num-tx-queues 4
    name GigabitEthernet2
  }
}
cpu {
  skip-cores 0
  main-core 0
  corelist-workers 1-3
}
buffers {
  # Max buffers based on data size & huge page configuration.
  buffers-per-numa 853440
  default data-size 2048
  page-size default-hugepage
}
statseg {
  size 128M
}
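As a rough cross-check of the buffers-per-numa value, a back-of-the-envelope calculation (a sketch only: the ~512 B of per-buffer metadata/headroom overhead is an assumption, and the exact per-buffer footprint depends on the VPP build):

```python
# Rough sizing sketch: how many buffers fit in the hugepage pool?
# OVERHEAD is an assumed figure for vlib metadata + headroom per buffer;
# the real footprint varies with the VPP build and configuration.

HUGEPAGE_BYTES = 1024 * 2048 * 1024   # 1024 x 2 MB pages = 2 GiB
DATA_SIZE = 2048                      # "default data-size 2048" above
OVERHEAD = 512                        # assumed per-buffer metadata + headroom

max_buffers = HUGEPAGE_BYTES // (DATA_SIZE + OVERHEAD)
print(max_buffers)  # 838860 under these assumptions
```

That lands in the same ballpark as the 853440 configured above.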
My issue is that I start to see errors from the mlx5 driver when using a large
number of buffers:
2022/06/29 12:44:11:427 notice dpdk common_mlx5: Unable to find
virtually contiguous chunk for address (0x1000000000). rte_memseg_contig_walk()
failed.
2022/06/29 12:44:11:427 notice dpdk common_mlx5: Unable to find
virtually contiguous chunk for address (0x103fe00000). rte_memseg_contig_walk()
failed.
2022/06/29 12:44:11:427 notice dpdk common_mlx5: Unable to find
virtually contiguous chunk for address (0x1040000000). rte_memseg_contig_walk()
failed.
2022/06/29 12:44:11:427 notice dpdk common_mlx5: Unable to find
virtually contiguous chunk for address (0x1040200000). rte_memseg_contig_walk()
failed.
The spew continues.
With a smaller number of buffers I don't see this problem, and there are no
issues on the packet forwarding side of things. I'm not sure what the buffer
limit is before things go bad.
I read the excellent description of how buffer sizes are calculated here:
https://lists.fd.io/g/vpp-dev/topic/buffer_occupancy_calculation/76605334?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,20,76605334
and as a result thought it would be a good idea to allocate buffers based on
buffer size and the available memory in huge pages. However, when the number
of buffers is large "enough", the common_mlx5 errors start to spew. I don't
see this issue on other platforms, where I am able to max out buffers based
on huge page allocation.
I was pointed towards
https://doc.dpdk.org/guides/platform/mlx5.html#mlx5-common-driver-options and
mr_ext_memseg_en, which would suppress this notice. However, I can only pass
DPDK EAL options to the netvsc PMD and not to mlx5, so this does not seem to
be an option.
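For illustration only: if per-device devargs were honoured by the mlx5 side, the stanza might look like the sketch below (hypothetical; as noted above, devargs here only reach the netvsc PMD, so this does not actually help in my setup):

```
dpdk {
  dev 6045bd85-2297-6045-bd85-22976045bd85 {
    name GigabitEthernet1
    # Hypothetical: would need to reach the mlx5 PMD to take effect.
    devargs mr_ext_memseg_en=0
  }
}
```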
Ideally I would like to max out the number of buffers based on available huge
page memory. Alternatively, if on some setups there is a cap on the number of
mappable buffers allowed for this specific device (mlx5), then I could cap at
that number instead of using the maximum based on huge page availability.
Thanks,
Peter.