[dpdk-dev] [PATCH] i40e: Use one bit flag for all hardware detected RX packet errors

Zhang, Helin Tue, 2 Dec 2014 07:25:59 +0000


> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Monday, December 1, 2014 5:58 PM
> To: Zhang, Helin; Ananyev, Konstantin; dev at dpdk.org
> Cc: Cao, Waterman; Cao, Min
> Subject: Re: [PATCH] i40e: Use one bit flag for all hardware detected RX 
> packet
> errors
> 
> Hi Helin,
> 
> On 12/01/2014 02:57 AM, Zhang, Helin wrote:
> >>> #define PKT_RX_EIP_CKSUM_BAD (0ULL << 0)  /**< External IP header
> >>> checksum error. */ [Helin] Nobody complains it, so we will keep it
> >>> there, and
> >> just assign a new value to it.
> >>
> >> ok.
> >>
> >> But it would be nice to have a better definition of this flag: does
> >> external mean outer header? For instance, when you receive a
> >> Ether/IP1/UDP/vxlan/Ether/IP2/xxx, does the flag concerns IP1 or IP2?
> > 'E' means 'external', it indicates the (most) outer IP header checksum
> > error. If you don't think this name is not so clear, I can change it to
> 'PKT_RX_OUTER_IP_CHSUM_BAD'.
> > For inner IP header checksum error, it will be indicated by
> PKT_RX_IP_CKSUM_BAD.
> >
> >>
> >> If it's IP1, it's really strange compared to the current behavior
> >> (the flag PKT_RX_IP_CKSUM_BAD refers to IP1).
> 
> Ok.
> But the real sense of my question was about the behavior which seems different
> than with previous hardware. Today, if you receive the packet
> Ether/IP1/UDP/vxlan/Ether/IP2/xxx on an ixgbe, the flag
> PKT_RX_IP_CKSUM_BAD can be set if the checksum of IP1 is wrong. From your
> explanation, I understand that PKT_RX_EIP_CKSUM_BAD would be set for the
> same thing on i40e. Is it correct?
Yes, it is strange if ixgbe hardware checksum logic knows the packet is 
tunneling.
But, from another point of view, which seems the the ixgbe hardware does, if
checksum logic does not know tunneling, it may treat all others as data. This is
reasonable to report the error with PKT_RX_IP_CKSUM_BAD.


> 
> 
> >>> #define PKT_RX_OVERSIZE      (0ULL << 0)  /**< Num of desc of an RX
> pkt
> >> oversize. */
> >>> [Helin] I don't think it can be merge with other hardware errors. It
> >>> indicates the packet received needs more descriptors than hardware
> >>> allowed, and the part of packets can still be stored in the mbufs
> >>> provided. It is a good hint for users that larger size of mbuf might
> >>> be needed. If just put it as hardware error, users will lose this
> >>> information. So I
> >> prefer to keep it there, and just assign a new value to it.
> >>
> >> Again, a statistic counter would do the job which if it's just to
> >> provide a hint to the application.
> > It seems that we do not maintain a counter for packets in PMD, if I am
> > not wrong. Two ways current DPDK is using.
> > One is hardware provide registers to do that, we can read it directly when
> needed.
> > The other one is that applications or middle layer sw maintain its own 
> > statistics.
> 
> rte_eth_stats_get() gives the generic statistics For specific error stats,
> rte_eth_xstats_get() can be used from an application (the driver has to 
> provide
> the full list of statistics).
Yes, that function read all the statistics from registers directly. I think 
there might
already have the corresponding registers for it.

> 
> >> I wonder in which case this flag can happen. If you fill the ring
> >> with mbufs that are large enough compared to your ethernet network,
> >> this should not happen in normal conditions. I really don't believe
> >> that an application receiving an mbuf with this flag would stop the 
> >> driver, then
> refill the rings it with larger mbufs.
> > This is not because of it is lack of available RX descriptors. It is
> > because of a hardware requirement. FVL hardware requires that it
> > should not use more than 5 rx descriptors for receiving a single packet.
> 
> I still don't understand what the application should do when the flag is set.
> Maybe you could provide an example in l2fwd or testpmd?
For an application, if it is reported with this oversize error, it then easily 
knows that
the mbuf size might not be enough. If it is reported with just a hardware 
error, then
the application developer needs to dig into the PMD to see what happens. I 
think most
of application developers do not want to debug PMD to much, as it might not be 
their scope.

> 
> >> Last but not least: If it's really useful, should we have the same
> >> behavior on other drivers like ixgbe? I think we really need to care
> >> about not having different ways to use the different drivers.
> > I don't see the similar bit in ixgbe datasheet, but this restriction
> > could be common for some other NICs, as it is reasonable from hardware
> perspective.
> 
> In ixgbe, there are other error cases:
> - frames shorter than 64 bytes
> - oversize (frames larger than MAXFRS)
> - ... maybe others?
> 
> Should we have a flag for each situation? I think not.
The more the better, but we may need a tradeoff, as the flag bits is limited.

> 
> 
> >> To me, the only argument in favor of keeping this flag is when the
> >> mbuf contains a part of the data that could be dumped by a user for debug
> purposes.
> >>
> >>> #define PKT_RX_HBUF_OVERFLOW (0ULL << 0)  /**< Header buffer
> overflow.
> >>> */ [Helin] It indicates the header buff size is not enough, but not
> >>> means hardware cannot process the packet received. It is a good hint
> >>> for the users to provide larger size of header buffers. I also
> >>> prefer to keep it
> >> there, and just assign new value to it.
> >>
> >> Same for this one.
> > It is a bit different from previous one, it always has one header
> > buffer, this flag means the buffer size is not enough for the header.
> > These two flags are because of buffer size or number of buffers. The
> > mbufs are prepared in application or up layer software. If these two
> > flags occur, it is easier for up layer software to debug, and know
> > different buffers are needed. They do not need to debug PMD, as they
> generally don't want to do.
> 
> You say it's easier for the software to debug, but I cannot see the 
> difference.
> When it's a statistics counter, you just have to use rte_eth_xstats_get(), 
> which is
> an equivalent of "ethtool -S iface"
> which gives all the hardware statistics. It will work for any driver and any
> application.
> 
> If we add these flags, the application have to know about all these specific 
> flags
> and how to handle them.
OK. I agree with you not all errors need to be reported as flags. But I'd 
prefer to
add those flags which is useful for applications to bypass the errors.

> 
> >>> #define PKT_RX_RECIP_ERR     (0ULL << 0)  /**< Hardware processing
> error.
> >> */
> >>> [Helin] In the latest data sheet, it is not opened to external users.
> >>> So we can just remove it from here.
> >>
> >> ok
> >>
> >>> #define PKT_RX_MAC_ERR       (0ULL << 0)  /**< MAC error. */
> >>> [Helin] This indicates a real hardware error happens.
> >>
> >> And what is the content of the mbuf data in this case? Does the
> >> application really need an mbuf?
> > Mbuf contains both the data and other information. I prefer to let the
> > up layer software to decide how to deal with the packet, no matter it
> > is correct or bad. In addition, even hardware errors happened, it
> > still can set a special bit to enable storing the whole packet to the
> > mbuf, for debug purpose. Hardware error bit can be used for all hardware
> errors. As we do not have one there, why not add one?
> 
> You say "let the up layer software to decide how to deal with the packet, no
> matter it is correct or bad". But what can do an application with a packet if 
> it does
> not know if the data is correct or bad?
Mbuf flags can tell him if the data is good or bad, if good, more info can be 
seen in flags.

> 
> Regards,
> Olivier

Yes, flags should be a tradeoff, we can provide useful flags in mbuf, but not 
all.

Regards,
Helin

[dpdk-dev] [PATCH] i40e: Use one bit flag for all hardware detected RX packet errors

Reply via email to