>>   static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
>>   {
>>         struct virtio_net_hdr_mrg_rxbuf *hdr;
>> @@ -1130,9 +1172,11 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>>         int err;
>>         struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
>>         bool kick = !skb->xmit_more;
>> +       bool use_napi = sq->napi.weight;
>>         /* Free up any pending old buffers before queueing new ones. */
>> -       free_old_xmit_skbs(sq);
>> +       if (!use_napi)
>> +               free_old_xmit_skbs(sq);
>
>
> I'm not sure this is best or even correct. Considering that we clean xmit
> packets speculatively in virtnet_poll_tx(), we need to call
> free_old_xmit_skbs() unconditionally. This can also help reduce the
> chance of napi rescheduling in virtnet_poll_tx().

That is because of the trylock used there. Absolutely, thanks! Perhaps I
should use the trylock only in the opportunistic clean path from the rx
softirq and take the full tx lock in the tx softirq, as sketched below.
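
Roughly what I have in mind is the sketch below. This is only illustrative:
free_old_xmit_skbs() and the send/receive queue structs are the ones in this
series, but the cleantx helper name, the comments on lock placement, and the
elided virtqueue callback re-enable are my assumptions rather than final code.

        /* Opportunistic clean from the rx softirq: trylock only, so the
         * rx path never contends with the tx path or the tx napi handler.
         */
        static void virtnet_poll_cleantx(struct receive_queue *rq)
        {
                struct virtnet_info *vi = rq->vq->vdev->priv;
                unsigned int index = vq2rxq(rq->vq);
                struct send_queue *sq = &vi->sq[index];
                struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, index);

                if (!sq->napi.weight)
                        return;

                if (__netif_tx_trylock(txq)) {
                        free_old_xmit_skbs(sq);
                        __netif_tx_unlock(txq);
                }
        }

        /* Tx softirq: take the tx lock unconditionally, so completions are
         * always reaped here and a failed trylock cannot force the napi to
         * reschedule.
         */
        static int virtnet_poll_tx(struct napi_struct *napi, int budget)
        {
                struct send_queue *sq = container_of(napi, struct send_queue, napi);
                struct virtnet_info *vi = sq->vq->vdev->priv;
                struct netdev_queue *txq;

                txq = netdev_get_tx_queue(vi->dev, vq2txq(sq->vq));

                __netif_tx_lock(txq, raw_smp_processor_id());
                free_old_xmit_skbs(sq);
                __netif_tx_unlock(txq);

                /* Callback re-enable and race handling elided for brevity. */
                napi_complete(napi);

                if (sq->vq->num_free >= 2 + MAX_SKB_FRAGS)
                        netif_tx_wake_queue(txq);

                return 0;
        }

That would keep the cheap opportunistic cleaning on the rx side while the tx
softirq always makes progress under the full lock.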

I previously observed that cleaning here would, counterintuitively,
reduce efficiency: it negated the gains from cleaning transmit
completions in the receive softirq. Going back through my data, I no
longer see that regression on the latest patchset.

Let me test again, with and without the new
virtqueue_enable_cb_delayed patch. Perhaps that made a
difference.
