> -----Original Message-----
> From: Jason Wang [mailto:[email protected]]
> Sent: Tuesday, December 22, 2020 12:41 PM
> To: Willem de Bruijn <[email protected]>; wangyunjian
> <[email protected]>
> Cc: Network Development <[email protected]>; Michael S. Tsirkin
> <[email protected]>; [email protected]; Lilijun (Jerry)
> <[email protected]>; chenchanghu <[email protected]>;
> xudingke <[email protected]>; huangbin (J)
> <[email protected]>
> Subject: Re: [PATCH net v2 2/2] vhost_net: fix high cpu load when sendmsg
> fails
>
>
> On 2020/12/22 7:07 AM, Willem de Bruijn wrote:
> > On Wed, Dec 16, 2020 at 3:20 AM wangyunjian<[email protected]> wrote:
> >> From: Yunjian Wang<[email protected]>
> >>
> >> Currently we break the loop and wake up the vhost_worker when sendmsg
> >> fails. When the worker wakes up again, we'll meet the same error.
> > The patch is based on the assumption that such error cases always
> > return EAGAIN. Can it not also be ENOMEM, such as from tun_build_skb?
> >
> >> This will cause high CPU load. To fix this issue, we can skip this
> >> description by ignoring the error. When we exceeds sndbuf, the return
> >> value of sendmsg is -EAGAIN. In the case we don't skip the
> >> description and don't drop packet.
> > the -> that
> >
> > here and above: description -> descriptor
> >
> > Perhaps slightly revise to more explicitly state that
> >
> > 1. in the case of persistent failure (i.e., bad packet), the driver
> >    drops the packet
> > 2. in the case of transient failure (e.g,. memory pressure) the driver
> >    schedules the worker to try again later
>
>
> If we want to go with this way, we need a better time to wakeup the worker.
> Otherwise it just produces more stress on the cpu that is what this patch
> tries to avoid.

The problem was initially discovered when a VM sent an abnormal packet,
which left the VM unable to send any more packets. Since commit
feb8892cb441c7 ("vhost_net: conditionally enable tx polling"), there have
also been high CPU consumption issues. It is the first problem that I am
actually more concerned with and want to solve.

Thanks

> Thanks
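
To make the persistent/transient split suggested above concrete, here is a
minimal userspace sketch. It is not the vhost_net code: the enum and the
function name are hypothetical, and treating -ENOMEM as transient alongside
-EAGAIN is an assumption taken from the "memory pressure" example in the
review, not something the patch is confirmed to do.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome of a failed transmit attempt; not a kernel type. */
enum tx_action {
	TX_RETRY_LATER,	/* transient: keep the descriptor and try again later */
	TX_DROP,	/* persistent: consume the descriptor, drop the packet */
};

/*
 * Classify a negative errno returned by a sendmsg-like call.
 * Assumption: only -EAGAIN (sndbuf exceeded) and -ENOMEM (memory
 * pressure) are transient; every other error is treated as a bad packet.
 */
static enum tx_action classify_send_error(int err)
{
	assert(err < 0);	/* only defined for failures */

	switch (err) {
	case -EAGAIN:
	case -ENOMEM:
		return TX_RETRY_LATER;
	default:
		return TX_DROP;
	}
}
```

With this split, an abnormal packet that fails with, say, -EINVAL is dropped
and the queue moves on, while -EAGAIN still leaves the descriptor in place.
When to wake the worker for the retry case is the open question raised
above and is not addressed by this sketch.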
