On Wed, May 04, 2016 at 03:16:44PM +0200, Marc-André Lureau wrote:
> Hi
>
> On Mon, May 2, 2016 at 2:04 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
> > On Mon, May 02, 2016 at 01:29:18PM +0200, Marc-André Lureau wrote:
> >> Hi
> >>
> >> On Mon, May 2, 2016 at 12:45 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
> >> > 1. Graceful disconnect
> >> > - One should be able to do vmstop, disconnect, connect, then vm start.
> >> >   This looks like a nice intermediate step.
> >> > - Why would we always assume it's the remote side initiating the
> >> >   disconnect? Monitor commands for disconnecting seem called for.
> >>
> >> Those two solutions involve VM management. This looks more complex to
> >> communicate/synchronize between the vhost-user backend, VM management
> >> and qemu. The use case I cover is a request from the backend to shut
> >> down,
> >
> > Right, but flushing buffers + closing the socket looks like
> > a cleaner interface than a ton of messages going back and forth.
>
> What do you mean by "a ton of messages"? It adds
> VHOST_USER_SET_SLAVE_FD (generic) and VHOST_USER_SLAVE_SHUTDOWN. The
> amount of work to flush and close is the same regardless. I figured
> out later that if we refactor the vhost-user socket handling, we may
> be able to accept requests on the main channel socket, without an
> extra "slave channel".
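
Just to make sure we are describing the same flow, here is roughly how
I picture the backend side with SLAVE_SHUTDOWN. This is only a sketch:
the message number, the header layout and the "qemu acks before the
backend may close" convention below are made up for illustration, not
taken from your patches.

#include <stdint.h>
#include <unistd.h>

/* vhost-user message header (request, flags, size); real messages
 * append a payload and are packed, omitted here for brevity. */
typedef struct {
    uint32_t request;
    uint32_t flags;
    uint32_t size;
} VhostUserHdr;

/* Hypothetical request number for the proposed message; the real
 * value would only be assigned when the protocol change is merged. */
#define VHOST_USER_SLAVE_SHUTDOWN  20
/* "This is a reply" bit in the flags field of the header. */
#define VHOST_USER_REPLY_MASK      (1u << 2)

/* Backend-specific: complete in-flight requests, publish used ring
 * entries, signal the guest one last time. */
static void flush_virtqueues(void)
{
}

/*
 * Ask qemu to stop using the rings, wait for its ack, then close.
 * slave_fd is the extra "slave channel" file descriptor that qemu
 * would hand down with VHOST_USER_SET_SLAVE_FD.
 */
static int slave_request_shutdown(int slave_fd)
{
    VhostUserHdr req = {
        .request = VHOST_USER_SLAVE_SHUTDOWN,
        .flags = 0,
        .size = 0,
    };
    VhostUserHdr ack;

    flush_virtqueues();

    if (write(slave_fd, &req, sizeof(req)) != sizeof(req)) {
        return -1;
    }
    if (read(slave_fd, &ack, sizeof(ack)) != sizeof(ack) ||
        !(ack.flags & VHOST_USER_REPLY_MASK)) {
        return -1;
    }

    close(slave_fd);
    return 0;
}

Either way the flush work is the same, as you say; the only question
is which socket carries the request.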
>
> >> because
> >> that's what the users wanted (and it is already possible to stop the
> >> backend and disconnect it from qemu, we would only need to know when,
> >> with a new command..)
> >
> > You assume the backend has a monitor interface to disconnect though.
> > That's not a given.
>
> What do you mean? The backend must have a way to request to close/quit,
> indeed. How the backend gets this information (via signals or other
> means) is out of scope here. It's external; having this information
> come from the VM management layer is the same (someone has to trigger
> it somehow).

Correct. So, for symmetry if nothing else, besides handling the slave
close request we should be able to initiate a close from qemu with a
new command, and get an event when not connected. Afterwards the
client can be killed with -9 as it's no longer connected to qemu.

> >> > 3. Running while disconnected
> >> > - At the moment, we wait on vm start for the remote to connect.
> >> >   If we support disconnecting the backend without stopping, we
> >> >   probably should also support starting without waiting for a
> >> >   connection.
> >>
> >> That's what Tetsuya proposed in his initial series, but handling the
> >> flags was quite tedious.
> >
> > That would be up to management. E.g. let the backend export the list
> > of flags it supports in some file, and apply it to QEMU.
>
> That makes me worry that such unfriendly connection details have to
> spread outside of vhost-user to the VM management layer, making usage
> and maintenance harder for no clear benefit. It's a similar concern to
> the one you have with "the backend has a monitor interface"; here "the
> backend must have an introspection interface", or at least export
> vhost-user details somehow.

How can we start the VM before the backend connects otherwise? Do you
have better ideas?

> >> I think this can easily be considered a
> >> separate enhancement. What I proposed is to keep waiting for the
> >> initial connect, and check that the flags remain compatible on
> >> reconnect.
> >
> > Seems asymmetrical unless we stop the vm.
>
> That's the point, there will be time with and without the backend if
> we keep the VM running.
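
Just so we mean the same thing by "compatible": I assume the check on
reconnect is basically the following (names are made up, only to
illustrate; the assumption is that the reconnecting backend must still
offer every feature bit the guest has already acked):

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative only: the guest's acked feature bits are fixed after
 * the first connect, so a reconnecting backend is acceptable only if
 * it still offers at least those bits.
 */
static bool vhost_user_reconnect_compatible(uint64_t acked_features,
                                            uint64_t new_backend_features)
{
    uint64_t missing = acked_features & ~new_backend_features;

    if (missing) {
        fprintf(stderr,
                "vhost-user: reconnected backend lacks features 0x%" PRIx64
                ", rejecting it\n", missing);
        return false;
    }
    return true;
}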
> >> > - Guests expect tx buffers to be used in a timely manner, thus:
> >> >   - If the vm is running we must use them in qemu, maybe discarding
> >> >     packets in the process.
> >> >     There already are commands for link down, I'm not sure there's
> >> >     value in doing this automatically in qemu.
> >>
> >> Testing doesn't show such buffer issues when the link is down (this
> >> can be tested with the vubr example above).
> >
> > Not enough testing then - it's a race condition: buffers can be sent
> > before link down.
>
> Ok, I'll do more testing. In any case, it looks reasonable to discard.

Discarding has some issues - for example, processing the ring in qemu
sometimes exposes us to more security risks.

> >> > - Alternatively, we should also have an option to stop the vm
> >> >   automatically (like we do on disk errors) to reduce the number
> >> >   of dropped packets.
> >>
> >> Why not, but we would need to know if this is actually useful.
> >>
> >> > 4. Reconnecting
> >> > - When acting as a server, we might want an option to go back to
> >> >   the listening state, but it should not be the only option:
> >> >   there should be a monitor command for moving it back to the
> >> >   listening state.
> >> > - When acting as a client, auto-reconnect might be handy sometimes,
> >> >   but it should not be the only option; there should be a way to
> >> >   manually request a connect, possibly to a different target, so a
> >> >   monitor command for re-connecting seems called for.
> >> > - We'll also need monitor events for disconnects so management
> >> >   knows it must re-connect/restart listening.
> >> > - If we stopped the VM, there could be an option to restart on
> >> >   reconnect.
> >>
> >> That's all involving a third party, adding complexity, but the
> >> benefit isn't so clear.
> >
> > It's rather clear to me. Let's assume you want to restart the bridge,
> > so you trigger a disconnect.
> > If qemu auto-reconnects, there's a race as it might re-connect
> > to the old bridge.
>
> I would say that race can trivially be avoided, so it's a backend bug.

How do you avoid it?

> > Management is required to make this robust; auto-reconnect
> > is handy for people bypassing management.
>
> tbh, I don't like auto-reconnect. My previous series didn't include
> this and assumed the feature would be supported only when qemu is
> configured to be the server. I added reconnect upon request by users.

I don't have better solutions, so OK I guess.

> --
> Marc-André Lureau