On Wed, May 13, 2026 at 09:44:49AM +0000, Polina Vishneva wrote:
> On Tue, 2026-05-12 at 12:02 -0400, Michael S. Tsirkin wrote:
> > On Tue, May 12, 2026 at 05:39:48PM +0200, Stefano Garzarella wrote:
> > > On Tue, May 12, 2026 at 02:32:14PM +0000, Polina Vishneva wrote:
> > > > On Mon, 2026-05-11 at 17:56 +0200, Stefano Garzarella wrote:
> > > > > On Mon, May 11, 2026 at 04:56:10PM +0200, Polina Vishneva wrote:
> > > > > > From: "Denis V. Lunev" <[email protected]>
> > > > > >
> > > > > > When the host initiates an AF_VSOCK connect() to a guest that has
> > > > > > not yet loaded the virtio-vsock transport (i.e. still booting), the
> > > > > > caller blocks for VSOCK_DEFAULT_CONNECT_TIMEOUT (2 seconds), because
> > > > > > vhost_transport_do_send_pkt() silently exits when
> > > > > > vhost_vq_get_backend(vq) returns NULL.
> > > > >
> > > > > Can SO_VM_SOCKETS_CONNECT_TIMEOUT helps on this?
> > > >
> > > > It can, but it might be difficult to find a correct timeout.
> > > >
> > > > And, generally, there's no way to distinguish "the guest hasn't yet
> > > > initialized the vq" from "the guest is up and running, but didn't
> > > > reply to connect() in time". That's exactly what this patch is
> > > > attempting to fix.
> > >
> > > Okay, so please mention this in the commit message, I mean why
> > > SO_VM_SOCKETS_CONNECT_TIMEOUT can't really help.
> > >
> > > > > >
> > > > > > If the guest doesn't start listening within this timeout, connect()
> > > > > > returns ETIMEDOUT.
> > > > > >
> > > > > > This delay is usually pointless and it doesn't well align with our
> > >
> > > I still don't understand why this is pointless. If an application wants to
> > > wait while sleeping, it can simply increase the timeout long enough to wait
> > > for the VM to start up and use a single `connect()` call, instead of
> > > continuing to try and wasting CPU cycles unnecessarily.
> > >
> > > Hmm, or maybe not, because the driver will definitely be initialized before
> > > the application that wants to listen on that port, so it will respond that
> > > no one is listening, and the `connect()` call will fail with an `ECONNRESET`
> > > error in any case. Right?
> > >
> > > If it is the case, is the following line in the commit description
> > > correct?
> > >
> > >     If the guest doesn't start listening within this timeout, connect()
> > >     returns ETIMEDOUT.
> > >
> > > I mean, also if the application starts to listen within the timeout, I think
> > > the connect() will fail in any case as I pointed out above (this should be
> > > another point in favour of this change)
> > >
> > >
> > > BTW, I think we should explain this more clearly both here and briefly in
> > > the code as well.
> > >
> > > > > > behavior at other initialization stages: for example, if a
> > > > > > connection is attempted when the guest driver is already loaded,
> > > > > > but when nothing is listening yet, it returns ECONNRESET
> > > > > > immediately without any wait.
> > > > > >
> > > > > > Fix this by checking the RX virtqueue backend in
> > > > > > vhost_transport_send_pkt() before queuing. If the backend is NULL,
> > > > > > return -ECONNREFUSED immediately.
> > > > > >
> > > > > > Signed-off-by: Denis V. Lunev <[email protected]>
> > > > > > Co-developed-by: Polina Vishneva <[email protected]>
> > > > > > Signed-off-by: Polina Vishneva <[email protected]>
> > > > > > ---
> > > > > >  drivers/vhost/vsock.c | 10 ++++++++++
> > > > > >  1 file changed, 10 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> > > > > > index 1d8ec6bed53e..a3f218292c3a 100644
> > > > > > --- a/drivers/vhost/vsock.c
> > > > > > +++ b/drivers/vhost/vsock.c
> > > > > > @@ -302,6 +302,16 @@ vhost_transport_send_pkt(struct sk_buff *skb, struct net *net)
> > > > > >  		return -ENODEV;
> > > > > >  	}
> > > > > >
> > > > > > +	/* Fast-fail if the guest hasn't enabled the RX vq yet. Reading
> > > > > > +	 * private_data without vq->mutex is deliberate: even if the backend becomes
> > > > > > +	 * NULL right after that check, do_send_pkt() checks it under the mutex.
> > > > > > +	 */
> > > > > > +	if (!data_race(READ_ONCE(vsock->vqs[VSOCK_VQ_RX].private_data)))
> > > > >
> > > > > Why not using vhost_vq_get_backend() ?
> > > >
> > > > Because it locks the mutex, which is slow and unacceptable in this hot
> > > > path.
> > >
> > > ehm, sorry, which mutex are you talking about?
> > >
> > > I see just a comment about the mutex to be acquired by the caller, but I
> > > don't see any lock there.
> > >
> > > > >
> > > > > Also is READ_ONCE() okay without WRITE_ONCE() where it is set ?
> > > >
> > > > It's racy, but as described here in the comment and in the commit
> > > > message, any possible race outcome is covered by the subsequent checks.
> > >
> > > Okay, so what is the point to call READ_ONCE()?
> > >
> > > > > > {
> > > > > > +		rcu_read_unlock();
> > > > > > +		kfree_skb(skb);
> > > > > > +		return -ECONNREFUSED;
> > > > >
> > > > > This is a generic send_pkt, is it okay to return ECONNREFUSED in any
> > > > > case?
> > > >
> > > > EHOSTUNREACH would probably be better.
> > > > All the current send_pkt functions only return ENODEV, but it has
> > > > different semantics: they mean that the local device isn't yet ready,
> > > > while there we're dealing with the opposite end not being ready.
> > >
> > > In the AF_VSOCK prespective, I see ENODEV like the transport is not ready,
> > > so I think it can eventually fit here too, but also EHOSTUNREACH is fine,
> > > for sure better than ECONNREFUSED.
> > >
> > > Thanks,
> > > Stefano
> >
> > I think it's worth trying to do the same thing with e.g. TCP
> > and see what error, if any, we get. Match that.
>
> This case is not directly applicable to TCP: in TCP, there's no out-of-band
> way to detect the "host up, but not initialized yet and not ready for
> connections" state: this could theoretically be ENOPROTOOPT, but no real TCP
> stack implement this, because replying with ICMP_PROT_UNREACH requires a TCP
> stack, which is exactly the thing that isn't up.
>
> So, in real world, a similar situation with TCP would result in ETIMEDOUT.

Then it might be best to just keep the current behaviour, which seems to
match that pretty closely?

> > > >
> > > > Best regards, Polina.
> > > >
> > > > >
> > > > > Thanks,
> > > > > Stefano
> > > > >
> > > > > > +	}
> > > > > > +
> > > > > >  	if (virtio_vsock_skb_reply(skb))
> > > > > >  		atomic_inc(&vsock->queued_replies);
> > > > > >
> > > > > >
> > > > > > base-commit: 8ab992f815d6736b5c7a6f5fd7bfe7bc106bb3dc
> > > > > > --
> > > > > > 2.53.0
> > > > > >
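
For reference, a minimal host-side sketch of how a caller could interpret
connect() results under the current behaviour. It assumes only the cases
described in this thread: success, ECONNRESET when the guest driver is loaded
but nothing is listening, and ETIMEDOUT when no reply arrives in time. The
enum and function names are hypothetical; they are not part of the patch or of
any kernel or libc API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names, for illustration only. */
enum guest_state {
	GUEST_CONNECTED,	/* connect() returned 0 */
	GUEST_NO_LISTENER,	/* driver loaded, nobody listening on the port */
	GUEST_NO_REPLY,		/* transport not loaded yet, or reply came too late */
	GUEST_OTHER_ERROR,	/* anything this sketch does not interpret */
};

/*
 * Map the errno left by a failed AF_VSOCK connect() (or 0 on success)
 * to the cases discussed in this thread. Assumes the current behaviour,
 * where ETIMEDOUT cannot tell "vq not initialized" from "guest too slow".
 */
static enum guest_state classify_connect_result(int err)
{
	assert(err >= 0);	/* errno values are non-negative */

	switch (err) {
	case 0:
		return GUEST_CONNECTED;
	case ECONNRESET:
		return GUEST_NO_LISTENER;
	case ETIMEDOUT:
		return GUEST_NO_REPLY;
	default:
		return GUEST_OTHER_ERROR;
	}
}
```

A caller would pass errno after connect() returns -1. On GUEST_NO_REPLY it can
only retry or raise SO_VM_SOCKETS_CONNECT_TIMEOUT, which is the ambiguity the
patch is trying to remove.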

