Hi Helin, On Mon, Jul 18, 2016 at 5:15 PM, Zhang, Helin <helin.zhang at intel.com> wrote: > Hi Ceara > > Could you help to let me know your firmware version?
# ethtool -i p7p1 | grep firmware firmware-version: f4.40.35115 a1.4 n4.53 e2021 > And could you help to try with the standard DPDK example application, such as > testpmd, to see if there is the same issue? > Basically we always set the same size for both rx and tx buffer, like the > default one of 2048 for a lot of applications. I'm a bit lost in the testpmd CLI. I enabled RSS, configured 2 RX queues per port and started sending traffic with single segmnet packets of size 2K but I didn't figure out how to actually verify that the RSS hash is correctly set.. Please let me know if I should do it in a different way. testpmd -c 0x331 -w 0000:82:00.0 -w 0000:83:00.0 -- --mbuf-size 2048 -i [...] testpmd> port stop all Stopping ports... Checking link statuses... Port 0 Link Up - speed 40000 Mbps - full-duplex Port 1 Link Up - speed 40000 Mbps - full-duplex Done testpmd> port config all txq 2 testpmd> port config all rss all testpmd> port config all max-pkt-len 2048 testpmd> port start all Configuring Port 0 (socket 0) PMD: i40e_set_tx_function_flag(): Vector tx can be enabled on this txq. PMD: i40e_set_tx_function_flag(): Vector tx can be enabled on this txq. PMD: i40e_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are satisfied. Rx Burst Bulk Alloc function will be used on port=0, queue=0. PMD: i40e_set_tx_function(): Vector tx finally be used. PMD: i40e_set_rx_function(): Using Vector Scattered Rx callback (port=0). Port 0: 3C:FD:FE:9D:BE:F0 Configuring Port 1 (socket 0) PMD: i40e_set_tx_function_flag(): Vector tx can be enabled on this txq. PMD: i40e_set_tx_function_flag(): Vector tx can be enabled on this txq. PMD: i40e_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are satisfied. Rx Burst Bulk Alloc function will be used on port=1, queue=0. PMD: i40e_set_tx_function(): Vector tx finally be used. PMD: i40e_set_rx_function(): Using Vector Scattered Rx callback (port=1). Port 1: 3C:FD:FE:9D:BF:30 Checking link statuses... Port 0 Link Up - speed 40000 Mbps - full-duplex Port 1 Link Up - speed 40000 Mbps - full-duplex Done testpmd> set txpkts 2048 testpmd> show config txpkts Number of segments: 1 Segment sizes: 2048 Split packet: off testpmd> start tx_first io packet forwarding - CRC stripping disabled - packets/burst=32 nb forwarding cores=1 - nb forwarding ports=2 RX queues=1 - RX desc=128 - RX free threshold=32 RX threshold registers: pthresh=8 hthresh=8 wthresh=0 TX queues=2 - TX desc=512 - TX free threshold=32 TX threshold registers: pthresh=32 hthresh=0 wthresh=0 TX RS bit threshold=32 - TXQ flags=0xf01 testpmd> stop Telling cores to stop... Waiting for lcores to finish... ---------------------- Forward statistics for port 0 ---------------------- RX-packets: 32 RX-dropped: 0 RX-total: 32 TX-packets: 32 TX-dropped: 0 TX-total: 32 ---------------------------------------------------------------------------- ---------------------- Forward statistics for port 1 ---------------------- RX-packets: 32 RX-dropped: 0 RX-total: 32 TX-packets: 32 TX-dropped: 0 TX-total: 32 ---------------------------------------------------------------------------- +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++ RX-packets: 64 RX-dropped: 0 RX-total: 64 TX-packets: 64 TX-dropped: 0 TX-total: 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Done. testpmd> > > Definitely we will try to reproduce that issue with testpmd, with using 2K > mbufs. Hopefully we can find the root cause, or tell you that's not an issue. > I forgot to mention that in my test code the TX/RX_MBUF_SIZE macros also include the mbuf headroom and the size of the mbuf structure. Therefore testing with 2K mbufs in my scenario actually creates mempools of objects of size 2K + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM. > Thank you very much for your reporting! > > BTW, dev at dpdk.org should be the right one to replace users at dpdk.org, > for sending questions/issues like this. Thanks, I'll keep that in mind. > > Regards, > Helin Regards, Dumitru > >> -----Original Message----- >> From: Take Ceara [mailto:dumitru.ceara at gmail.com] >> Sent: Monday, July 18, 2016 4:03 PM >> To: users at dpdk.org >> Cc: Zhang, Helin <helin.zhang at intel.com>; Wu, Jingjing <jingjing.wu at >> intel.com> >> Subject: [dpdk-users] RSS Hash not working for XL710/X710 NICs for some RX >> mbuf sizes >> >> Hi, >> >> Is there any known issue regarding the i40e DPDK driver when having RSS >> hashing enabled in DPDK 16.04? >> I've noticed that for some specific receive mbuf sizes the RSS hash is >> always set >> to 0 for incoming packets. >> >> I have a setup with two XL710 ports connected back to back. The simple test >> program below sends fixed TCP packets from port 0 to port 1. The >> L5 payload is added in the packet in such a way that the packet consumes >> exactly >> one TX mbuf. For some values of the RX mbuf size the incoming mbuf has the >> hash.rss == 0 even though the PKT_RX_RSS_HASH flag is set in ol_flags. In my >> code the TX/RX mbuf sizes are controlled by the RX_MBUF_SIZE and >> TX_MBUF_SIZE macros. >> >> As an example, with some of the following TX/RX sizes the assert that checks >> if >> the RSS hash is non-zero fails and with the other it passes: >> >> RX_MBUF_SIZE TX_MBUF_SIZE assert >> ================================= >> 1024 1024 fail >> 1025 1024 ok >> 1024 2048 fail >> 2048 2048 fail >> 2048 2047 fail >> 2049 2048 ok >> >> On the same setup I have another loopback connection between two 82599ES >> 10G NICs and when I run exactly the same test the RSS hash is always correct >> in >> all cases. >> >> $ $RTE_SDK/tools/dpdk_nic_bind.py -s >> >> Network devices using DPDK-compatible driver >> ============================================ >> 0000:02:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' >> drv=igb_uio unused= >> 0000:03:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection' >> drv=igb_uio unused= >> 0000:82:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' drv=igb_uio unused= >> 0000:83:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' drv=igb_uio unused= >> >> The command line I use for running the test on the 40G NICs is: >> >> ./build/test -c 0x1 -n 4 -m 1024 -w 0000:82:00.0 -w 0000:83:00.0 >> >> Thanks, >> Dumitru Ceara >> >> #include <stdbool.h> >> #include <stdint.h> >> #include <assert.h> >> #include <unistd.h> >> >> #include <rte_ethdev.h> >> #include <rte_timer.h> >> #include <rte_ip.h> >> #include <rte_tcp.h> >> #include <rte_udp.h> >> #include <rte_errno.h> >> #include <rte_arp.h> >> >> #define MBUF_SIZE(frag_size) \ >> ((frag_size) + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM) >> >> #define RX_MBUF_SIZE MBUF_SIZE(RTE_MBUF_DEFAULT_DATAROOM) >> #define TX_MBUF_SIZE MBUF_SIZE(RTE_MBUF_DEFAULT_DATAROOM) >> >> #define MBUF_CACHE 512 >> #define MBUF_COUNT 1024 >> >> static struct rte_mempool *rx_mpool; >> static struct rte_mempool *tx_mpool; >> >> #define PORT_MAX_MTU 9198 >> >> #define L5_GET_LEN(pkt) (rte_pktmbuf_tailroom((pkt))) >> >> #define PORT0 0 >> #define PORT1 1 >> #define QUEUE0 0 >> #define Q_CNT 1 >> >> >> struct rte_eth_conf default_port_config = { >> .rxmode = { >> .mq_mode = ETH_MQ_RX_RSS, >> .max_rx_pkt_len = PORT_MAX_MTU, >> .split_hdr_size = 0, >> .header_split = 0, /**< Header Split disabled */ >> .hw_ip_checksum = 1, /**< IP checksum offload enabled */ >> .hw_vlan_filter = 0, /**< VLAN filtering disabled */ >> .jumbo_frame = 1, /**< Jumbo Frame Support disabled */ >> .hw_strip_crc = 0, /**< CRC stripped by hardware */ >> }, >> .rx_adv_conf = { >> .rss_conf = { >> .rss_key = NULL, >> .rss_key_len = 0, >> .rss_hf = ETH_RSS_IPV4 | ETH_RSS_NONFRAG_IPV4_TCP | >> ETH_RSS_NONFRAG_IPV4_UDP, >> }, >> }, >> .txmode = { >> .mq_mode = ETH_MQ_TX_NONE, >> } >> }; >> >> struct rte_eth_rxconf rx_conf = { >> .rx_thresh = { >> .pthresh = 8, >> .hthresh = 8, >> .wthresh = 4, >> }, >> .rx_free_thresh = 64, >> .rx_drop_en = 0 >> }; >> >> struct rte_eth_txconf tx_conf = { >> .tx_thresh = { >> .pthresh = 36, >> .hthresh = 0, >> .wthresh = 0, >> }, >> .tx_free_thresh = 64, >> .tx_rs_thresh = 32, >> }; >> >> static void port_setup(uint32_t port) >> { >> uint32_t queue; >> int ret; >> >> assert(rte_eth_dev_configure(port, Q_CNT, Q_CNT, >> &default_port_config) == 0); >> for (queue = 0; queue < Q_CNT; queue++) { >> ret = rte_eth_rx_queue_setup(port, queue, 128, SOCKET_ID_ANY, >> &rx_conf, >> rx_mpool); >> assert(ret == 0); >> ret = rte_eth_tx_queue_setup(port, queue, 128, SOCKET_ID_ANY, >> &tx_conf); >> assert(ret == 0); >> } >> >> assert(rte_eth_dev_start(port) == 0); } >> >> #define HDRS_SIZE \ >> (sizeof(struct ether_hdr) + \ >> sizeof(struct ipv4_hdr) + \ >> sizeof(struct tcp_hdr)) >> >> static struct rte_mbuf *get_tcp_pkt(uint16_t eth_port) { >> struct rte_mbuf *pkt; >> struct ether_hdr *eth_hdr; >> struct ipv4_hdr *ip_hdr; >> struct tcp_hdr *tcp_hdr; >> uint32_t ip_hdr_len = sizeof(*ip_hdr); >> uint32_t tcp_hdr_len = sizeof(*tcp_hdr); >> uint32_t l5_len; >> >> assert(pkt = rte_pktmbuf_alloc(tx_mpool)); >> >> pkt->port = eth_port; >> pkt->l2_len = sizeof(*eth_hdr); >> >> RTE_LOG(ERR, USER1, "1:head = %d, tail = %d, len = %d\n", >> rte_pktmbuf_headroom(pkt), rte_pktmbuf_tailroom(pkt), >> rte_pktmbuf_pkt_len(pkt)); >> >> /* Reserve space for ETH + IP + TCP Headers. >> * Store how much tailroom we have. >> */ >> eth_hdr = (struct ether_hdr *)rte_pktmbuf_append(pkt, HDRS_SIZE); >> assert(eth_hdr); >> l5_len = L5_GET_LEN(pkt); >> >> /* ETH Header. */ >> rte_eth_macaddr_get(PORT0, ð_hdr->s_addr); >> rte_eth_macaddr_get(PORT1, ð_hdr->d_addr); >> eth_hdr->ether_type = rte_cpu_to_be_16(ETHER_TYPE_IPv4); >> >> /* IP Header. */ >> ip_hdr = (struct ipv4_hdr *)(eth_hdr + 1); >> ip_hdr->version_ihl = (4 << 4) | (ip_hdr_len >> 2); >> ip_hdr->type_of_service = 0; >> ip_hdr->total_length = rte_cpu_to_be_16(ip_hdr_len + tcp_hdr_len + >> l5_len); >> ip_hdr->packet_id = 0; >> ip_hdr->fragment_offset = rte_cpu_to_be_16(0); >> ip_hdr->time_to_live = 60; >> ip_hdr->next_proto_id = IPPROTO_TCP; >> ip_hdr->src_addr = rte_cpu_to_be_32(0x01010101); >> ip_hdr->dst_addr = rte_cpu_to_be_32(0x01010101); >> ip_hdr->hdr_checksum = rte_cpu_to_be_16(0); >> >> pkt->l3_len = ip_hdr_len; >> pkt->ol_flags |= PKT_TX_IP_CKSUM; >> >> /* TCP Header. */ >> tcp_hdr = (struct tcp_hdr *)(ip_hdr + 1); >> tcp_hdr->src_port = rte_cpu_to_be_16(0x42); >> tcp_hdr->dst_port = rte_cpu_to_be_16(0x24); >> tcp_hdr->sent_seq = rte_cpu_to_be_32(0x1234); >> tcp_hdr->recv_ack = rte_cpu_to_be_32(0x1234); >> tcp_hdr->data_off = tcp_hdr_len >> 2 << 4; >> tcp_hdr->tcp_flags = TCP_FIN_FLAG; >> tcp_hdr->rx_win = rte_cpu_to_be_16(0xffff); >> tcp_hdr->tcp_urp = rte_cpu_to_be_16(0); >> >> pkt->ol_flags |= PKT_TX_TCP_CKSUM | PKT_TX_IPV4; >> pkt->l4_len = tcp_hdr_len; >> >> tcp_hdr->cksum = 0; >> tcp_hdr->cksum = rte_ipv4_phdr_cksum(ip_hdr, pkt->ol_flags); >> >> /* Add Payload. */ >> assert(rte_pktmbuf_append(pkt, l5_len)); >> >> RTE_LOG(ERR, USER1, "1:head = %d, tail = %d, len = %d\n", >> rte_pktmbuf_headroom(pkt), rte_pktmbuf_tailroom(pkt), >> rte_pktmbuf_pkt_len(pkt)); >> >> return pkt; >> } >> >> int main(int argc, char **argv) >> { >> struct rte_mbuf *tx_mbuf[3]; >> >> rte_eal_init(argc, argv); >> >> rx_mpool = rte_mempool_create("rx_mpool", MBUF_COUNT, >> RX_MBUF_SIZE, >> 0, >> sizeof(struct rte_pktmbuf_pool_private), >> rte_pktmbuf_pool_init, NULL, >> rte_pktmbuf_init, NULL, >> SOCKET_ID_ANY, >> 0); >> >> tx_mpool = rte_mempool_create("tx_mpool", MBUF_COUNT, >> TX_MBUF_SIZE, >> 0, >> sizeof(struct rte_pktmbuf_pool_private), >> rte_pktmbuf_pool_init, NULL, >> rte_pktmbuf_init, NULL, >> SOCKET_ID_ANY, >> 0); >> >> assert(rx_mpool && tx_mpool); >> >> port_setup(PORT0); >> port_setup(PORT1); >> >> for (;;) { >> uint16_t no_rx_buffers; >> uint16_t i; >> struct rte_mbuf *rx_pkts[16]; >> >> tx_mbuf[0] = get_tcp_pkt(PORT0); >> assert(rte_eth_tx_burst(PORT0, QUEUE0, tx_mbuf, 1) == 1); >> >> no_rx_buffers = rte_eth_rx_burst(PORT1, QUEUE0, rx_pkts, 16); >> for (i = 0; i < no_rx_buffers; i++) { >> RTE_LOG(ERR, USER1, "RX RSS HASH: %8lX %4X\n", >> rx_pkts[i]->ol_flags, >> rx_pkts[i]->hash.rss); >> >> assert(rx_pkts[i]->ol_flags == PKT_RX_RSS_HASH); >> assert(rx_pkts[i]->hash.rss != 0); >> >> rte_pktmbuf_free(rx_pkts[i]); >> } >> } >> >> return 0; >> } -- Dumitru Ceara