Hi Qi,
On Thur, May 4, 2023 at 9:33PM, Zhang, Qi Z wrote:
-----Original Message-----
From: Morten Brørup <m...@smartsharesystems.com>
Sent: Thursday, May 4, 2023 9:22 PM
To: zhoumin <zhou...@loongson.cn>; Konstantin Ananyev
<konstantin.v.anan...@yandex.ru>
Cc: dev@dpdk.org; maob...@loongson.cn; Yang, Qiming
<qiming.y...@intel.com>; Wu, Wenjun1 <wenjun1...@intel.com>;
ruifeng.w...@arm.com; d...@linux.vnet.ibm.com; Tyler Retzlaff
<roret...@linux.microsoft.com>
Subject: RE: [PATCH v2] net/ixgbe: add proper memory barriers for some Rx
functions
From: zhoumin [mailto:zhou...@loongson.cn]
Sent: Thursday, 4 May 2023 15.17
Hi Konstantin,
Thanks for your comments.
On 2023/5/1 下午9:29, Konstantin Ananyev wrote:
Segmentation fault has been observed while running the
ixgbe_recv_pkts_lro() function to receive packets on the Loongson
3C5000 processor which has 64 cores and 4 NUMA nodes.
From the ixgbe_recv_pkts_lro() function, we found that as long as
the first packet has the EOP bit set, and the length of this packet
is less than or equal to rxq->crc_len, the segmentation fault will
definitely happen even though on the other platforms, such as X86.
Sorry to interrupt, but I am curious why this issue still exists on x86
architecture. Can volatile be used to instruct the compiler to generate read
instructions in a specific order, and does x86 guarantee not to reorder load
operations?
Actually, I did not see the segmentation fault on X86. I just made the
first packet which had the EOP bit set had a zero length, then the
segmentation fault would happen on X86. So, I thought that the
out-of-order access to the descriptor might be possible to make the
ready packet zero length, and this case was more likely to cause the
segmentation fault.
Because when processd the first packet the first_seg->next will be
NULL, if at the same time this packet has the EOP bit set and its
length is less than or equal to rxq->crc_len, the following loop
will be excecuted:
for (lp = first_seg; lp->next != rxm; lp = lp->next)
;
We know that the first_seg->next will be NULL under this condition.
So the
expression of lp->next->next will cause the segmentation fault.
Normally, the length of the first packet with EOP bit set will be
greater than rxq->crc_len. However, the out-of-order execution of
CPU may make the read ordering of the status and the rest of the
descriptor fields in this function not be correct. The related
codes are as following:
rxdp = &rx_ring[rx_id];
#1 staterr = rte_le_to_cpu_32(rxdp->wb.upper.status_error);
if (!(staterr & IXGBE_RXDADV_STAT_DD))
break;
#2 rxd = *rxdp;
The sentence #2 may be executed before sentence #1. This action is
likely to make the ready packet zero length. If the packet is the
first packet and has the EOP bit set, the above segmentation fault
will happen.
So, we should add rte_rmb() to ensure the read ordering be correct.
We also
did the same thing in the ixgbe_recv_pkts() function to make the
rxd data be valid even thougth we did not find segmentation fault
in this function.
Signed-off-by: Min Zhou <zhou...@loongson.cn>
---
v2:
- Make the calling of rte_rmb() for all platforms
---
drivers/net/ixgbe/ixgbe_rxtx.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c
b/drivers/net/ixgbe/ixgbe_rxtx.c index c9d6ca9efe..302a5ab7ff
100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1823,6 +1823,8 @@ ixgbe_recv_pkts(void *rx_queue, struct
rte_mbuf **rx_pkts,
staterr = rxdp->wb.upper.status_error;
if (!(staterr & rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD)))
break;
+
+ rte_rmb();
rxd = *rxdp;
Indeed, looks like a problem to me on systems with relaxed MO.
Strange that it was never hit on arm or ppc - cc-ing ARM/PPC maintainers.
The LoongArch architecture uses the Weak Consistency model which can
cause the problem, especially in scenario with many cores, such as
Loongson 3C5000 with four NUMA node, which has 64 cores. I cannot
reproduce it on Loongson 3C5000 with one NUMA node, which just has 16
cores.
About a fix - looks right, but a bit excessive to me - as I
understand all we need here is to prevent re-ordering by CPU itself.
Yes, thanks for cc-ing.
So rte_smp_rmb() seems enough here.
Or might be just:
staterr = __atomic_load_n(&rxdp->wb.upper.status_error,
__ATOMIC_ACQUIRE);
Does __atomic_load_n() work on Windows if we use it to solve this
problem ?
Yes, __atomic_load_n() works on Windows too.
Best regards,
Min