On 3/5/24 18:13, Kevin Traynor wrote:
> On 05/03/2024 15:22, Ilya Maximets wrote:
>> It's hard to debug situations where driver rejects packets for some
>> reason.  Dumping out the mbuf should help with debugging.
>>
>> Sample output looks like this:
>>
>>   |netdev_dpdk(pmd-c03/id:8)|DBG|ovs-p1: Invalid packet:
>>   dump mbuf at 0x1180bce140, iova=0x2cb7ce400, buf_len=2176
>>     pkt_len=64, ol_flags=0x2, nb_segs=1, port=65535, ptype=0
>>     segment at 0x1180bce140, data=0x1180bce580, len=90, off=384, refcnt=1
>>     Dump data at [0x1180bce580], len=64
>>   00000000: 33 33 00 00 00 16 AA 27 91 F9 4D 96 86 DD 60 00 | 
>> 33.....'..M...`.
>>   00000010: 00 00 00 24 00 01 00 00 00 00 00 00 00 00 00 00 | 
>> ...$............
>>   00000020: 00 00 00 00 00 00 FF 02 00 00 00 00 00 00 00 00 | 
>> ................
>>   00000030: 00 00 00 00 00 16 3A 00 05 02 00 00 01 00 8F 00 | 
>> ......:.........
>>
> 
> Hi Ilya,
> 
> Thanks for the patch, that could be useful. One comment below.
> 
>> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
>> ---
>>  lib/netdev-dpdk.c | 33 +++++++++++++++++++++++++++++++++
>>  1 file changed, 33 insertions(+)
>>
>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>> index 45f61930d..662b41d5c 100644
>> --- a/lib/netdev-dpdk.c
>> +++ b/lib/netdev-dpdk.c
>> @@ -2664,6 +2664,35 @@ netdev_dpdk_prep_hwol_batch(struct netdev_dpdk *dev, 
>> struct rte_mbuf **pkts,
>>      return cnt;
>>  }
>>  
>> +static void
>> +netdev_dpdk_mbuf_dump(const char *prefix, const char *message,
>> +                      const struct rte_mbuf *mbuf)
>> +{
>> +    static struct vlog_rate_limit dump_rl = VLOG_RATE_LIMIT_INIT(5, 5);
>> +    char *response = NULL;
>> +    FILE *stream;
>> +    size_t size;
>> +
>> +    if (VLOG_DROP_DBG(&dump_rl)) {
>> +        return;
>> +    }
>> +
>> +    stream = open_memstream(&response, &size);
>> +    if (!stream) {
>> +        VLOG_ERR("Unable to open memstream for mbuf dump: %s.",
>> +                 ovs_strerror(errno));
>> +        return;
>> +    }
>> +
>> +    rte_pktmbuf_dump(stream, mbuf, rte_pktmbuf_pkt_len(mbuf));
>> +
>> +    fclose(stream);
>> +
>> +    VLOG_DBG(prefix ? "%s: %s:\n%s" : "%s%s:\n%s",
>> +             prefix ? prefix : "", message, response);
>> +    free(response);
>> +}
>> +
>>  /* Tries to transmit 'pkts' to txq 'qid' of device 'dev'.  Takes ownership 
>> of
>>   * 'pkts', even in case of failure.
>>   *
>> @@ -2680,6 +2709,10 @@ netdev_dpdk_eth_tx_burst(struct netdev_dpdk *dev, int 
>> qid,
>>          VLOG_WARN_RL(&rl, "%s: Output batch contains invalid packets. "
>>                       "Only %u/%u are valid: %s", netdev_get_name(&dev->up),
>>                       nb_tx_prep, cnt, rte_strerror(rte_errno));
>> +        for (size_t i = nb_tx_prep; i < cnt; i++) {
>> +            netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
>> +                                  "Invalid packet", pkts[i]);
> 
> rte_eth_tx_prepare() stops when it finds an invalid packet, so we don't
> know the status of subsequent packets except that they will not be sent.
> It might be confusing if we dump them and label as invalid.

Ah, interesting.  I just assumed that since netdev_dpdk_eth_tx_burst()
is calling it once, it will re-order the packets moving the bad ones
to the end like some other functions do.  But you're right, that's not
the case.  And the existing log message is actually not right either.

> 
> How about only dump the known invalid packet or else replace "Invalid
> packet" with, i == nb_tx_prep ? "Invalid packet" : "Not sent"
> 
> (maybe a better term than "Not sent")
> 
> What do you think ?

I don't think we need to dump the 'not sent' packets at all.  We don't
know if they are bad or not, so dumping them doesn't give any meaningful
information.  We can just dump the "First invalid packet", I suppose.
It is the one that failed the check.  E.g.:

    netdev_dpdk_mbuf_dump(netdev_get_name(&dev->up),
                          "First invalid packet", pkts[nb_tx_prep]);

In general, we should call the rte_eth_tx_prepare() multiple times
re-ordering the packets between calls, so we don't drop packets that are
actually good.  But, the probability of only a few packets in a batch to
be bad is very low, so it's probably fine to keep the code as it is.

What do you think?

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to