On Wed, Jul 07, 2021 at 05:58:50PM +0300, Denis Plotnikov wrote:
> 
> On 07.07.2021 17:39, Michael S. Tsirkin wrote:
> > On Wed, Jul 07, 2021 at 03:19:20PM +0300, Denis Plotnikov wrote:
> > > On 07.07.2021 13:10, Michael S. Tsirkin wrote:
> > > > On Fri, Jun 25, 2021 at 11:52:10AM +0300, Denis Plotnikov wrote:
> > > > > On vhost-user-blk migration, qemu normally sends a number of commands
> > > > > to enable logging if VHOST_USER_PROTOCOL_F_LOG_SHMFD is negotiated.
> > > > > Qemu sends VHOST_USER_SET_FEATURES to enable buffer logging and
> > > > > VHOST_USER_SET_VRING_ADDR for each started ring to enable "used ring"
> > > > > data logging.
> > > > > The issue is that qemu doesn't wait for a reply to these commands from
> > > > > the vhost daemon, which may result in a race between qemu's expectation
> > > > > that logging has started and the actual start of logging in the daemon.
> > > > Could you be more explicit please? What kind of race have you
> > > > observed? Getting a reply slows down the setup considerably and
> > > > should not be done lightly.
> > > I'm talking about the vhost-user-blk case. On migration setup, we enable
> > > logging by sending VHOST_USER_SET_FEATURES. The command doesn't arrive at
> > > the vhost-user-blk daemon immediately, and the daemon needs some time to
> > > turn the logging on internally. If qemu doesn't wait for a reply, it may
> > > start migrating memory pages right after sending the command. At that
> > > point logging may not actually be turned on in the daemon yet, but some
> > > guest pages, which the daemon is about to write to, may already have been
> > > transferred to the destination without logging. Since the logging wasn't
> > > turned on, those pages won't be transferred again as dirty. So we may end
> > > up with corrupted data on the destination.
> > > 
> > > Have I managed to explain the case clearly?
> > > 
> > > Thanks!
> > > 
> > > Denis
> > OK, so this is just about enabling logging. It would be cleaner to
> > defer migrating memory until the response arrives ... if that is too
> > hard, at least document why we are doing this, please.
> > And let's wait for an ack just in that case then - why not?
> > 
> > And what about VHOST_USER_SET_PROTOCOL_FEATURES?
> 
> The code uses the same path for both VHOST_USER_SET_PROTOCOL_FEATURES and
> VHOST_USER_SET_FEATURES via vhost_user_set_u64(). So I decided to suggest
> adding a reply to both of them, so that both feature-setting commands work
> the same way, which doesn't contradict the vhost-user spec.
> 
> I'm not sure it's worth doing that, so if you think it's not, I'll just
> remove them.
> 
> 
> Denis


I'm inclined to say let's not add to the latency of setting up the
device unnecessarily.
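
For illustration, the "wait for an ack only in the logging case" idea could
look roughly like the sketch below: the ordinary device-setup path stays
reply-free, and the extra round trip is paid only when a ring is switched to
logging. The wait_for_reply variable and the VHOST_VRING_F_LOG check are
assumptions of this sketch, not the posted patch:

    /* Sketch only: request an ack for SET_VRING_ADDR only when the ring
     * is being put into logging mode, so normal setup latency is unchanged.
     */
    static int vhost_user_set_vring_addr(struct vhost_dev *dev,
                                         struct vhost_vring_addr *addr)
    {
        VhostUserMsg msg = {
            .hdr.request = VHOST_USER_SET_VRING_ADDR,
            .hdr.flags = VHOST_USER_VERSION,
            .payload.addr = *addr,
            .hdr.size = sizeof(msg.payload.addr),
        };
        /* Only wait when logging is being enabled and REPLY_ACK was negotiated */
        bool wait_for_reply = (addr->flags & (1 << VHOST_VRING_F_LOG)) &&
                              virtio_has_feature(dev->protocol_features,
                                                 VHOST_USER_PROTOCOL_F_REPLY_ACK);

        if (wait_for_reply) {
            msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
        }

        if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
            return -1;
        }

        return wait_for_reply ? process_message_reply(dev, &msg) : 0;
    }

With this shape, vhost_user_set_u64() could take a similar "wait for reply"
flag so that only the SET_FEATURES call made on migration setup waits for the
ack, not every feature negotiation.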

> > 
> > 
> > > > Thanks!
> > > > 
> > > > > To resolve this issue, this patch makes qemu wait for the command
> > > > > result explicitly if VHOST_USER_PROTOCOL_F_REPLY_ACK is negotiated.
> > > > > Also, this patch adds reply waiting for the
> > > > > VHOST_USER_SET_PROTOCOL_FEATURES command so that the feature-setting
> > > > > functions work similarly.
> > > > > 
> > > > > Signed-off-by: Denis Plotnikov <den-plotni...@yandex-team.ru>
> > > > > ---
> > > > >    hw/virtio/vhost-user.c | 20 ++++++++++++++++++++
> > > > >    1 file changed, 20 insertions(+)
> > > > > 
> > > > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> > > > > index ee57abe04526..e47b82adab00 100644
> > > > > --- a/hw/virtio/vhost-user.c
> > > > > +++ b/hw/virtio/vhost-user.c
> > > > > @@ -1105,10 +1105,20 @@ static int vhost_user_set_vring_addr(struct vhost_dev *dev,
> > > > >            .hdr.size = sizeof(msg.payload.addr),
> > > > >        };
> > > > > +    bool reply_supported = virtio_has_feature(dev->protocol_features,
> > > > > +                                              VHOST_USER_PROTOCOL_F_REPLY_ACK);
> > > > > +    if (reply_supported) {
> > > > > +        msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
> > > > > +    }
> > > > > +
> > > > >        if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> > > > >            return -1;
> > > > >        }
> > > > > +    if (reply_supported) {
> > > > > +        return process_message_reply(dev, &msg);
> > > > > +    }
> > > > > +
> > > > >        return 0;
> > > > >    }
> > > > > @@ -1297,10 +1307,20 @@ static int vhost_user_set_u64(struct vhost_dev *dev, int request, uint64_t u64)
> > > > >            .hdr.size = sizeof(msg.payload.u64),
> > > > >        };
> > > > > +    bool reply_supported = virtio_has_feature(dev->protocol_features,
> > > > > +                                              VHOST_USER_PROTOCOL_F_REPLY_ACK);
> > > > > +    if (reply_supported) {
> > > > > +        msg.hdr.flags |= VHOST_USER_NEED_REPLY_MASK;
> > > > > +    }
> > > > > +
> > > > >        if (vhost_user_write(dev, &msg, NULL, 0) < 0) {
> > > > >            return -1;
> > > > >        }
> > > > > +    if (reply_supported) {
> > > > > +        return process_message_reply(dev, &msg);
> > > > > +    }
> > > > > +
> > > > >        return 0;
> > > > >    }
> > > > > -- 
> > > > > 2.25.1
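
For reference, the ack the patch waits for is the vhost-user spec's REPLY_ACK
mechanism: when a request carries the need-reply flag and
VHOST_USER_PROTOCOL_F_REPLY_ACK was negotiated, the backend must answer with a
u64 payload where 0 means success. A rough sketch of the backend side follows;
the message layout mirrors the frontend's, while maybe_ack() and send_msg()
are assumed helper names for this sketch, not libvhost-user API:

    /* Flag bits in the vhost-user message header, per the spec:
     * bits 0-1 carry the protocol version, bit 2 marks a reply,
     * bit 3 asks the backend to acknowledge the request.
     */
    #define VHOST_USER_REPLY_MASK      (1 << 2)
    #define VHOST_USER_NEED_REPLY_MASK (1 << 3)

    /* Send an ack for 'req' if (and only if) the frontend asked for one.
     * VhostUserMsg and send_msg() are assumed helpers for this sketch.
     */
    static void maybe_ack(int sock_fd, const VhostUserMsg *req, uint64_t err)
    {
        VhostUserMsg reply = *req;

        if (!(req->hdr.flags & VHOST_USER_NEED_REPLY_MASK)) {
            return;                          /* no ack requested */
        }
        reply.hdr.flags &= ~VHOST_USER_NEED_REPLY_MASK;
        reply.hdr.flags |= VHOST_USER_REPLY_MASK;
        reply.payload.u64 = err;             /* 0 means success */
        reply.hdr.size = sizeof(reply.payload.u64);
        send_msg(sock_fd, &reply);
    }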

