On Mon, Jan 23, 2023 at 05:52:17PM +0200, Anton Kuchin wrote:
> On 23/01/2023 16:09, Stefan Hajnoczi wrote:
> > On Sun, 22 Jan 2023 at 11:18, Michael S. Tsirkin <m...@redhat.com> wrote:
> > > On Sun, Jan 22, 2023 at 06:09:40PM +0200, Anton Kuchin wrote:
> > > > On 22/01/2023 16:46, Michael S. Tsirkin wrote:
> > > > > On Sun, Jan 22, 2023 at 02:36:04PM +0200, Anton Kuchin wrote:
> > > > > > > > This flag should be set when qemu doesn't need to worry
> > > > > > > > about any external state stored in vhost-user daemons
> > > > > > > > during migration: don't fail migration, just pack the
> > > > > > > > generic virtio device state into the migration stream,
> > > > > > > > and the orchestrator guarantees that the rest of the
> > > > > > > > state will be present at the destination to restore the
> > > > > > > > full context and continue running.
> > > > > > > Sorry, I still do not get it. So fundamentally, why do we
> > > > > > > need this property? vhost-user-fs is not created by
> > > > > > > default such that we'd then need to opt in to the special
> > > > > > > "migratable" case. That's why I said it might make some
> > > > > > > sense as a device property, as qemu tracks whether the
> > > > > > > device is unplugged for us.
> > > > > > >
> > > > > > > But as written, if you are going to teach the orchestrator
> > > > > > > about vhost-user-fs and its special needs, just teach it
> > > > > > > when to migrate and when not to migrate.
> > > > > > >
> > > > > > > Either we describe the special situation to qemu and let
> > > > > > > qemu make an intelligent decision whether to allow
> > > > > > > migration, or we trust the orchestrator. And if it's the
> > > > > > > latter, then 'migrate' already says the orchestrator
> > > > > > > decided to migrate.
> > > > > > The problem I'm trying to solve is that most vhost-user
> > > > > > devices now block migration in qemu.
> > > > > > And this makes sense, since qemu can't extract and transfer
> > > > > > backend daemon state. But it prevents us from updating the
> > > > > > qemu executable via local migration. So this flag is
> > > > > > intended more as a safety check that says "I know what I'm
> > > > > > doing".
> > > > > >
> > > > > > I agree that it is not really necessary if we trust the
> > > > > > orchestrator to request migration only when the migration
> > > > > > can be performed in a safe way. But changing the current
> > > > > > behavior of vhost-user-fs from "always blocks migration" to
> > > > > > "migrates partial state whenever the orchestrator requests
> > > > > > it" seems a little dangerous and can be misinterpreted as
> > > > > > full support for migration in all cases.
> > > > > It's not really different from block, is it? The orchestrator
> > > > > has to arrange for backend migration. I think we just assumed
> > > > > there's no use-case where this is practical for vhost-user-fs,
> > > > > so we blocked it. But in any case it's the orchestrator's
> > > > > responsibility.
> > > > Yes, you are right. So do you think we should just drop the
> > > > blocker without adding a new flag?
> > > I'd be inclined to. I am curious what dgilbert and stefanha
> > > think, though.
> > If the migration blocker is removed, what happens when a user
> > attempts to migrate with a management tool and/or vhost-user-fs
> > server implementation that doesn't support migration?
> There will be no matching FUSE session at the destination endpoint,
> so all requests to this fs will fail until it is remounted from the
> guest to send a new FUSE_INIT message that performs session setup.
The point of the migration blocker is to prevent breaking running
guests. Situations where a migration completes but results in a broken
guest are problematic for users (especially when they are not logged
in to guests and able to fix them interactively). If a command-line
option is set to override the blocker, that's fine. But there needs to
be a blocker by default if external knowledge is required to decide
whether or not it's safe to migrate.

> > Anton: Can you explain how stateless migration will work on the
> > vhost-user-fs back-end side? Is it reusing vhost-user reconnect
> > functionality or introducing a new mode for stateless migration? I
> > guess the vhost-user-fs back-end implementation is required to
> > implement VHOST_F_LOG_ALL so dirty memory can be tracked, and to
> > drain all in-flight requests when vrings are stopped?
> It reuses the existing vhost-user reconnect code to resubmit
> in-flight requests.
> Sure, the backend needs to support this feature - the presence of the
> required features is checked by the generic vhost and vhost-user code
> during init, and if something is missing a migration blocker is
> assigned to the device (not the static one in vmstate that I remove
> in this patch, but another per-device kind of blocker).

This is not enough detail. Please post the QEMU patches before we
commit to a user-visible vhost-user-fs command-line parameter.

I think what you're trying is a new approach that can be made to work.
However, both vhost-user and migration are fragile, and you have not
explained how it will work. I don't have confidence in merging this
incrementally because I'm afraid of committing to user-visible or
vhost-user protocol behavior that turns out to be broken just a little
while later.

The kind of detail I was hoping to hear was, for example, how
vhost_user_blk_device_realize() blocks and tries to reconnect 3 times.
Does this approach work for stateless migration?
The destination QEMU is launched before the source QEMU disconnects
from the vhost-user UNIX domain socket, so I guess the destination
QEMU cannot connect in the current version of vhost-user reconnect as
implemented by QEMU's vhost-user-blk device. Have you come up with a
new handover protocol?

Stefan