On Tue, 2026-05-12 at 17:39 +0200, Stefano Garzarella wrote:
> On Tue, May 12, 2026 at 02:32:14PM +0000, Polina Vishneva wrote:
> > On Mon, 2026-05-11 at 17:56 +0200, Stefano Garzarella wrote:
> > > On Mon, May 11, 2026 at 04:56:10PM +0200, Polina Vishneva wrote:
> > > > From: "Denis V. Lunev" <[email protected]>
> > > > 
> > > > When the host initiates an AF_VSOCK connect() to a guest that has not
> > > > yet loaded the virtio-vsock transport (i.e. still booting), the caller
> > > > blocks for VSOCK_DEFAULT_CONNECT_TIMEOUT (2 seconds), because
> > > > vhost_transport_do_send_pkt() silently exits when
> > > > vhost_vq_get_backend(vq) returns NULL.
> > > 
> > > Can SO_VM_SOCKETS_CONNECT_TIMEOUT help with this?
> > 
> > It can, but it might be difficult to find a correct timeout.
> > 
> > And, generally, there's no way to distinguish "the guest hasn't yet initialized
> > the vq" from "the guest is up and running, but didn't reply to connect() in
> > time". That's exactly what this patch is attempting to fix.
> 
> Okay, so please mention this in the commit message, I mean why 
> SO_VM_SOCKETS_CONNECT_TIMEOUT can't really help.

Will do.

> 
> > 
> > > 
> > > > 
> > > > If the guest doesn't start listening within this timeout, connect()
> > > > returns ETIMEDOUT.
> > > > 
> > > > This delay is usually pointless and it doesn't align well with our
> 
> I still don't understand why this is pointless. If an application wants 
> to wait while sleeping, it can simply increase the timeout long enough 
> to wait for the VM to start up and use a single `connect()` call, 
> instead of continuing to try and wasting CPU cycles unnecessarily.
> 
> Hmm, or maybe not, because the driver will definitely be initialized 
> before the application that wants to listen on that port, so it will 
> respond that no one is listening, and the `connect()` call will fail 
> with an `ECONNRESET` error in any case. Right?

That's the case indeed.

> 
> If it is the case, is the following line in the commit description 
> correct?
> 
>      If the guest doesn't start listening within this timeout, connect()
>      returns ETIMEDOUT.
> 
> I mean, even if the application starts listening within the timeout, I
> think the connect() will fail in any case, as I pointed out above (this
> should be another point in favour of this change).

Yes, the commit message should be updated, as well as the code comment.

> 
> 
> BTW, I think we should explain this more clearly both here and briefly 
> in the code as well.

Definitely.

> 
> > > > behavior at other initialization stages: for example, if a connection is
> > > > attempted when the guest driver is already loaded, but when nothing is
> > > > listening yet, it returns ECONNRESET immediately without any wait.
> > > > 
> > > > Fix this by checking the RX virtqueue backend in
> > > > vhost_transport_send_pkt() before queuing. If the backend is NULL,
> > > > return -ECONNREFUSED immediately.
> > > > 
> > > > Signed-off-by: Denis V. Lunev <[email protected]>
> > > > Co-developed-by: Polina Vishneva <[email protected]>
> > > > Signed-off-by: Polina Vishneva <[email protected]>
> > > > ---
> > > > drivers/vhost/vsock.c | 10 ++++++++++
> > > > 1 file changed, 10 insertions(+)
> > > > 
> > > > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> > > > index 1d8ec6bed53e..a3f218292c3a 100644
> > > > --- a/drivers/vhost/vsock.c
> > > > +++ b/drivers/vhost/vsock.c
> > > > @@ -302,6 +302,16 @@ vhost_transport_send_pkt(struct sk_buff *skb, 
> > > > struct net *net)
> > > >                 return -ENODEV;
> > > >         }
> > > > 
> > > > +       /* Fast-fail if the guest hasn't enabled the RX vq yet. Reading
> > > > > +        * private_data without vq->mutex is deliberate: even if the backend becomes
> > > > > +        * NULL right after that check, do_send_pkt() checks it under the mutex.
> > > > +        */
> > > > +       if (!data_race(READ_ONCE(vsock->vqs[VSOCK_VQ_RX].private_data)))
> > > 
> > > > Why not use vhost_vq_get_backend()?
> > 
> > > Because it locks the mutex, which is slow and unacceptable in this hot path.
> 
> ehm, sorry, which mutex are you talking about?
> 
> I see just a comment about the mutex to be acquired by the caller, but I 
> don't see any lock there.

The comment in vhost.h says "Context: Need to call with vq->mutex acquired.",
but the helper itself takes no lock, so I guess we're safe to call it here
instead of accessing private_data manually. Thanks for pointing this out.

> 
> > 
> > > 
> > > > Also, is READ_ONCE() okay without WRITE_ONCE() where it is set?
> > 
> > It's racy, but as described here in the comment and in the commit message,
> > any possible race outcome is covered by the subsequent checks.
> 
> Okay, so what is the point of calling READ_ONCE()?

Probably none. It was there in the initial patch version, and I decided to keep
it when adding data_race(). Will drop it.

> 
> > 
> > > > {
> > > > +               rcu_read_unlock();
> > > > +               kfree_skb(skb);
> > > > +               return -ECONNREFUSED;
> > > 
> > > This is a generic send_pkt, is it okay to return ECONNREFUSED in any
> > > case?
> > 
> > EHOSTUNREACH would probably be better.
> > > All the current send_pkt functions only return ENODEV, but it has different
> > > semantics: it means that the local device isn't ready yet, while here we're
> > > dealing with the opposite end not being ready.
> 
> From the AF_VSOCK perspective, I read ENODEV as "the transport is not
> ready", so I think it could fit here too, but EHOSTUNREACH is also fine,
> and certainly better than ECONNREFUSED.

EHOSTUNREACH is indeed a better fit, agreed.
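
To make sure we mean the same thing, here is a rough user-space sketch of the
check with the points above folded in: the backend is read through a
vhost_vq_get_backend()-style helper, READ_ONCE() is gone, and the error is
EHOSTUNREACH. This is not the v2 patch. Every mock_* name is a stand-in made up
for illustration, the RCU handling and skb freeing are left out, and returning
the length on success is an assumption of the mock.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for struct vhost_virtqueue: only the field the check reads. */
struct mock_vq {
	void *private_data;	/* NULL until the guest enables the vq */
};

enum { MOCK_VQ_RX, MOCK_VQ_TX, MOCK_VQ_MAX };

/* Stand-in for struct vhost_vsock. */
struct mock_vsock {
	struct mock_vq vqs[MOCK_VQ_MAX];
	int queued;		/* packets accepted for delivery */
};

/* Same shape as vhost_vq_get_backend(): a plain read, no locking inside. */
static void *mock_vq_get_backend(struct mock_vq *vq)
{
	return vq->private_data;
}

/*
 * Hypothetical model of the send path. A NULL vsock stands for "no local
 * device" (ENODEV); a NULL RX backend stands for "the guest has not enabled
 * the RX vq yet" (EHOSTUNREACH), reported before anything is queued.
 */
static int mock_send_pkt(struct mock_vsock *vsock, size_t len)
{
	if (!vsock)
		return -ENODEV;

	/* Fast-fail: the peer is not ready, so do not queue the packet. */
	if (!mock_vq_get_backend(&vsock->vqs[MOCK_VQ_RX]))
		return -EHOSTUNREACH;

	vsock->queued++;
	return (int)len;
}
```

The lockless read is only a hint here as well: in the real code a stale non-NULL
result is still caught by the check done under the mutex in do_send_pkt().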

> 
> Thanks,
> Stefano
> 
> > 
> > Best regards, Polina.
> > 
> > > 
> > > Thanks,
> > > Stefano
> > > 
> > > > +       }
> > > > +
> > > >         if (virtio_vsock_skb_reply(skb))
> > > >                 atomic_inc(&vsock->queued_replies);
> > > > 
> > > > 
> > > > base-commit: 8ab992f815d6736b5c7a6f5fd7bfe7bc106bb3dc
> > > > --
> > > > 2.53.0
> > > > 
