Re: [Qemu-devel] custom virt-io support (in user-mode-linux)
On Thu, 2019-05-23 at 15:41 +0100, Stefan Hajnoczi wrote:

> > Also, not sure I understand how the client is started?
>
> The vhost-user device backend can be launched before QEMU. QEMU is
> started with the UNIX domain socket path so it can connect.

Hmm. I guess I'm confusing the terminology then - I thought qemu was the
server and the backend was the client that connects to it. If it's the
other way around, yeah, that makes things easier and certainly makes
sense (you could have a daemon that implements something).

> QEMU itself doesn't fork+exec the vhost-user device backend. It's
> expected that the user or the management stack has already launched
> the vhost-user device backend.

Right.

> > Do you know if there's a sample client/server somewhere?
>
> See contrib/libvhost-user in the QEMU source tree as well as the
> vhost-user-blk and vhost-user-scsi examples in the contrib/ directory.

Awesome, thanks!

johannes
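The launch order discussed in this message can be sketched roughly as follows. This is illustrative Python, not code from QEMU or libvhost-user: the socket path, the `run_backend` helper and the "hello"/"ack" payloads are all made up, and a thread stands in for the separately started backend process.

```python
import os
import socket
import tempfile
import threading

def run_backend(listener, received):
    # Hypothetical backend: accept one connection, record what arrives.
    conn, _ = listener.accept()
    with conn:
        received.append(conn.recv(64))
        conn.sendall(b"ack")

# Made-up socket path; in practice both sides are told the same path.
sock_path = os.path.join(tempfile.mkdtemp(), "backend.sock")

# Step 1: the backend is started first and listens on the socket.
listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
listener.bind(sock_path)
listener.listen(1)
received = []
backend = threading.Thread(target=run_backend, args=(listener, received))
backend.start()

# Step 2: the front end (QEMU's role here) connects to the existing socket.
frontend = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
frontend.connect(sock_path)
frontend.sendall(b"hello")
reply = frontend.recv(64)

backend.join()
frontend.close()
listener.close()
os.unlink(sock_path)
```

Nothing in the sketch forks or execs the backend from the front end; whoever sets up the environment starts both, which matches the description quoted above.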
Re: [Qemu-devel] custom virt-io support (in user-mode-linux)
On Thu, May 23, 2019 at 3:25 PM Johannes Berg wrote:

> Not sure I understand why there's all this stuff about multiple FDs,
> once you have access to the guest's memory, why do you still need a
> second (or more) FDs?

The memory regions could be different files (maybe additional RAM was
hotplugged later).

> Also, not sure I understand how the client is started?

The vhost-user device backend can be launched before QEMU. QEMU is
started with the UNIX domain socket path so it can connect.

QEMU itself doesn't fork+exec the vhost-user device backend. It's
expected that the user or the management stack has already launched the
vhost-user device backend.

> Once we have a connection, I guess as a client I'd at the very least
> have to handle
>  * VHOST_USER_GET_FEATURES and reply with the features, obviously, which
>    is in this case just VHOST_USER_F_PROTOCOL_FEATURES?
>
>  * VHOST_USER_SET_FEATURES - not sure, what would that do? the master
>    sends VHOST_USER_GET_PROTOCOL_FEATURES which is with this feature
>    bit? Especially since it says: "Slave that reported
>    VHOST_USER_F_PROTOCOL_FEATURES must support this message even before
>    VHOST_USER_SET_FEATURES was called."
>
>  * VHOST_USER_GET_PROTOCOL_FEATURES - looking at the list, most I don't
>    really need here, but OK
>
>  * VHOST_USER_SET_OWNER - ??
>
>  * VHOST_USER_RESET_OWNER - ignore
>
>  * VHOST_USER_SET_MEM_TABLE - store the data/FDs for later use, I guess
>
>  * VHOST_USER_SET_VRING_NUM - store the data for later use
>  * VHOST_USER_SET_VRING_ADDR - ditto
>  * VHOST_USER_SET_VRING_BASE - ditto
>  * VHOST_USER_SET_VRING_KICK - start epoll on the FD (assuming there is
>                                one, give up if not?) - well, if ring is
>                                enabled?
>  * VHOST_USER_SET_VRING_CALL - ...
>
> I guess there might be better documentation on the ioctl interfaces?
>
> Do you know if there's a sample client/server somewhere?

See contrib/libvhost-user in the QEMU source tree as well as the
vhost-user-blk and vhost-user-scsi examples in the contrib/ directory.

Stefan
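To make the "multiple FDs" answer concrete, here is a minimal sketch of passing one descriptor per memory region over a UNIX domain socket (SCM_RIGHTS) and mapping each on the receiving side. It assumes Linux and Python 3.9 or later (for `socket.send_fds` / `socket.recv_fds`); the temporary files, the 4096-byte region size and the `b"mem-table"` payload are placeholders and not the real vhost-user message format.

```python
import mmap
import os
import socket
import tempfile

REGION_SIZE = 4096  # placeholder size for each stand-in region

def make_region(fill):
    # Stand-in for one file-backed guest RAM region.
    f = tempfile.TemporaryFile()
    f.write(fill * REGION_SIZE)
    f.flush()
    return f

# Two regions, e.g. the initial RAM plus a block that was hotplugged later.
regions = [make_region(b"A"), make_region(b"B")]

front_end, backend = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Front end: a single message carrying one descriptor per region.
socket.send_fds(front_end, [b"mem-table"], [f.fileno() for f in regions])

# Backend: receive the payload plus the descriptors, then map every region.
msg, fds, _flags, _addr = socket.recv_fds(backend, 64, 8)
maps = [mmap.mmap(fd, REGION_SIZE) for fd in fds]
first_bytes = [m[:1] for m in maps]

# A write through the backend's mapping is visible through the original file.
maps[1][0:4] = b"used"
maps[1].flush()
regions[1].seek(0)
seen_by_front_end = regions[1].read(4)

for m in maps:
    m.close()
for fd in fds:
    os.close(fd)
for f in regions:
    f.close()
front_end.close()
backend.close()
```

Each region needs its own descriptor because each may be backed by a different file, which is the point made in the reply above.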
Re: [Qemu-devel] custom virt-io support (in user-mode-linux)
Hi Stefan,

> Check out vhost-user. It's a protocol for running a subset of a VIRTIO
> device's emulation in a separate process (usually just the data plane
> with the PCI emulation and other configuration/setup still handled by
> QEMU).

Yes, I think that's basically what I'm looking for.

> vhost-user uses a UNIX domain socket to pass file descriptors to shared
> memory regions. This way the vhost-user device backend process has
> access to guest RAM.
>
> This would be quite different for UML since my understanding is you
> don't have guest RAM but actual host Linux processes, but vhost-user
> might still give you ideas:
> https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/vhost-user.rst;hb=HEAD

I guess it could still be implemented. Do you know how qemu actually
creates the shared memory region though? It's normally inside kernel
memory, no?

Ah, no, I see ... you have to give -mem-path and then the entire guest
memory isn't allocated as anonymous memory but from a file, and then you
can pass a descriptor to that file and effectively the client/slave of
vhost-user can access the whole guest's memory. Interesting. Next you're
going to want an IOMMU there, not just a fake one, to protect against a
hostile vhost-user client? Not that I care :-)

UML in fact already maps all of its memory as a file (see arch/um/
create_mem_file()), so this part is easy.

What confused me at first is how all this talks about the ioctl()
interface, but I think I understand now - it's basically replacing
ioctl() with talking to a client.

So ultimately, it would actually seem "pretty simple".

Not sure I understand why there's all this stuff about multiple FDs,
once you have access to the guest's memory, why do you still need a
second (or more) FDs?

Also, not sure I understand how the client is started?

Once we have a connection, I guess as a client I'd at the very least
have to handle
 * VHOST_USER_GET_FEATURES and reply with the features, obviously, which
   is in this case just VHOST_USER_F_PROTOCOL_FEATURES?

 * VHOST_USER_SET_FEATURES - not sure, what would that do? the master
   sends VHOST_USER_GET_PROTOCOL_FEATURES which is with this feature
   bit? Especially since it says: "Slave that reported
   VHOST_USER_F_PROTOCOL_FEATURES must support this message even before
   VHOST_USER_SET_FEATURES was called."

 * VHOST_USER_GET_PROTOCOL_FEATURES - looking at the list, most I don't
   really need here, but OK

 * VHOST_USER_SET_OWNER - ??

 * VHOST_USER_RESET_OWNER - ignore

 * VHOST_USER_SET_MEM_TABLE - store the data/FDs for later use, I guess

 * VHOST_USER_SET_VRING_NUM - store the data for later use
 * VHOST_USER_SET_VRING_ADDR - ditto
 * VHOST_USER_SET_VRING_BASE - ditto
 * VHOST_USER_SET_VRING_KICK - start epoll on the FD (assuming there is
                               one, give up if not?) - well, if ring is
                               enabled?
 * VHOST_USER_SET_VRING_CALL - ...

I guess there might be better documentation on the ioctl interfaces?

Do you know if there's a sample client/server somewhere?

I guess we should implement the server in UML like it is in QEMU (unless
we can figure out how to virtualize the time with HPET or something in
QEMU) and then have our client and kernel driver for it...

Thanks a lot!

johannes
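The first few items of the list above could be handled along these lines. This is a hedged sketch, not libvhost-user: the request numbers, the feature bit and the header layout (three 32-bit fields for request, flags and payload size, followed by a 64-bit payload) are written down as I read them in the vhost-user document linked above and should be checked against it. The `handle` function and the `state` dictionary are hypothetical.

```python
import struct

# Values as read from docs/interop/vhost-user.rst; verify before relying on them.
VHOST_USER_GET_FEATURES = 1
VHOST_USER_SET_FEATURES = 2
VHOST_USER_SET_OWNER = 3
VHOST_USER_GET_PROTOCOL_FEATURES = 15
VHOST_USER_F_PROTOCOL_FEATURES = 30

HEADER = struct.Struct("=III")  # request, flags, payload size (native byte order)
U64 = struct.Struct("=Q")
VERSION = 0x1                   # protocol version in the low bits of flags
REPLY = 0x4                     # flag bit that marks a message as a reply

def handle(message, state):
    """Return reply bytes for one request, or None when no reply is sent."""
    request, _flags, size = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + size]
    if request == VHOST_USER_GET_FEATURES:
        # Offer only the protocol-features bit, as suggested in the list.
        features = 1 << VHOST_USER_F_PROTOCOL_FEATURES
        return HEADER.pack(request, VERSION | REPLY, U64.size) + U64.pack(features)
    if request == VHOST_USER_GET_PROTOCOL_FEATURES:
        # This sketch supports no optional protocol features.
        return HEADER.pack(request, VERSION | REPLY, U64.size) + U64.pack(0)
    if request == VHOST_USER_SET_FEATURES:
        (state["features"],) = U64.unpack(payload)
        return None
    if request == VHOST_USER_SET_OWNER:
        state["owned"] = True
        return None
    raise ValueError(f"unhandled request {request}")

state = {}
reply = handle(HEADER.pack(VHOST_USER_GET_FEATURES, VERSION, 0), state)
reply_request, reply_flags, reply_size = HEADER.unpack_from(reply)
(offered,) = U64.unpack_from(reply, HEADER.size)

set_msg = HEADER.pack(VHOST_USER_SET_FEATURES, VERSION, U64.size) + U64.pack(offered)
set_reply = handle(set_msg, state)
```

The remaining messages in the list (memory table, vring setup, kick and call descriptors) would extend the same dispatch, with the descriptors arriving as ancillary data alongside the message.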
Re: [Qemu-devel] custom virt-io support (in user-mode-linux)
On Wed, May 22, 2019 at 03:02:38PM +0200, Johannes Berg wrote:
> Hi,
>
> While my main interest is mostly in UML right now [1] I've CC'ed the
> qemu and virtualization lists because something similar might actually
> apply to other types of virtualization.
>
> I'm thinking about adding virt-io support to UML, but the tricky part is
> that while I want to use the virt-io basics (because it's a nice
> interface from the 'inside'), I don't actually want the stock drivers
> that are part of the kernel now (like virtio-net etc.) but rather
> something that integrates with wifi (probably building on hwsim).
>
> The 'inside' interfaces aren't really a problem - just have a specific
> device ID for this, and then write a normal virtio kernel driver for it.
>
> The 'outside' interfaces are where my thinking breaks down right now.
>
> Looking at lkl, the outside is just all implemented in lkl as code that
> gets linked to the library, so in UML terms it'd just be extra 'outside'
> code like the timer handling or other netdev stuff we have today.
> Looking at qemu, it's of course also implemented there, and then
> interfaces with the real network, console abstraction, etc.
>
> However, like I said above, I really need something very custom and not
> likely to make it upstream to any project (because what point is that if
> you cannot connect to the rest of the environment I'm building), so I'm
> thinking that perhaps it should be possible to write an abstract
> 'outside' that lets you interact with it really from out-of-process?
> Perhaps through some kind of shared memory segment? I think that gets
> tricky with virt-io doing DMA (I think it does?) though, so that part
> would have to be implemented directly and not out-of-process?
>
> But really that's why I'm asking - is there a better way than to just
> link the device-side virt-io code into the same binary (be it lkl lib,
> uml binary, qemu binary)?

Hi Johannes,

Check out vhost-user. It's a protocol for running a subset of a VIRTIO
device's emulation in a separate process (usually just the data plane
with the PCI emulation and other configuration/setup still handled by
QEMU).

vhost-user uses a UNIX domain socket to pass file descriptors to shared
memory regions. This way the vhost-user device backend process has
access to guest RAM.

This would be quite different for UML since my understanding is you
don't have guest RAM but actual host Linux processes, but vhost-user
might still give you ideas:
https://git.qemu.org/?p=qemu.git;a=blob_plain;f=docs/interop/vhost-user.rst;hb=HEAD

Stefan
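Once the backend holds descriptors for the shared memory regions, it still has to turn a guest address into a position inside one of the mapped files. A minimal sketch of that lookup, assuming a region table loosely modeled on what the memory-table message describes; the `Region` field names, the `translate` helper and the example addresses are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Region:
    guest_addr: int   # start of the region in guest physical address space
    size: int         # length of the region in bytes
    mmap_offset: int  # where the region starts inside its backing file
    fd_index: int     # which received descriptor backs this region

def translate(regions, guest_addr, length):
    """Map a guest address range to (fd_index, offset in the backing file)."""
    for r in regions:
        start, end = r.guest_addr, r.guest_addr + r.size
        if start <= guest_addr and guest_addr + length <= end:
            return r.fd_index, r.mmap_offset + (guest_addr - start)
    raise ValueError("range is not covered by a single region")

# Example layout: low RAM in one file, a second block at 4 GiB in another.
table = [
    Region(guest_addr=0x0, size=0x1000_0000, mmap_offset=0, fd_index=0),
    Region(guest_addr=0x1_0000_0000, size=0x1000_0000, mmap_offset=0, fd_index=1),
]
location = translate(table, 0x1_0000_2000, 16)
```

For UML, where all memory is already one file, the table would presumably collapse to a single entry.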
Re: [Qemu-devel] custom virt-io support (in user-mode-linux)
On 22/05/2019 14:46, Johannes Berg wrote:
> Hi Anton,
>
> > > I'm thinking about adding virt-io support to UML, but the tricky
> > > part is that while I want to use the virt-io basics (because it's
> > > a nice interface from the 'inside'), I don't actually want the
> > > stock drivers that are part of the kernel now (like virtio-net
> > > etc.) but rather something that integrates with wifi (probably
> > > building on hwsim).
> >
> > I have looked at using virtio semantics in UML in the past around
> > the point when I wanted to make the recvmmsg/sendmmsg vector drivers
> > common in UML and QEMU. It is certainly possible,
> >
> > I went for the native approach at the end though.
>
> Hmm. I'm not sure what you mean by either :-)
>
> Is there any commonality between the vector drivers?

I was looking purely from a network driver perspective.

I had two options - either do a direct read/write as it does today or
implement the ring/kick semantics and read/write from that.

I decided to not bother with the latter and read/write directly from/to
skbs.

> I can't see how that'd work without a bus abstraction (like virtio) in
> qemu? I mean, the kernel driver just calls uml_vector_sendmmsg(),
> which I'd say belongs more to the 'outside world', but that can't
> really be done in qemu?
>
> Ok, I guess then I see what you mean by 'native' though.
>
> Similarly, of course, I can implement arbitrary virt-io devices - just
> the kernel side doesn't call a function like uml_vector_sendmmsg()
> directly, but instead the virt-io model, and the model calls the
> function, which essentially is the same just with a (convenient)
> abstraction layer.
>
> But this leaves the fundamental fact the model code ("vector_user.c"
> or a similar "virtio_user.c") is still part of the build.
>
> I guess what I'm thinking is have something like "virtio_user_rpc.c"
> that uses some appropriate RPC to interact with the real model. IOW,
> rather than having all the model-specific logic actually be here (like
> vector_user.c actually knows how to send network packets over a real
> socket fd), try to call out to some RPC that contains the real model.
>
> Now that I thought about it further, I guess my question boils down to
> "did anyone ever think about doing RPC for Virt-IO instead of putting
> the entire device model into the hypervisor/emulator/...".

Virtio in general no. UML specifically - yes. I have thought of mapping
out all key device calls to RPCs for a few applications.

The issue is that it is fairly difficult to make all of this function
cleanly without blocking in strange places.

You may probably want to look at the UML UBD driver. That is an example
of moving out all processing to an external thread and talking to it via
a request/response API. While it still expects shared memory and needs
access to UML address space the model should be more amenable to
replacing various calls with RPCs as you have now left the rest of the
kernel to run while you are processing the RPC. It also provides you
with RPC completion interrupts, etc as a side effect.

So you basically have UML -> Thread -> RPCs -> Model?

> johannes

--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/
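The request/response pattern with completion notifications described above can be sketched in user space as follows. This is only an analogy in Python, not the UBD driver's actual code: the two queues, the pipe standing in for a completion interrupt and the upper-casing "model" are all invented for illustration.

```python
import os
import queue
import threading

requests = queue.Queue()
completions = queue.Queue()
irq_read, irq_write = os.pipe()  # stand-in for a completion interrupt line

def model_thread():
    # External thread: take a request, do the (possibly blocking) work,
    # post the response, then signal completion with one byte on the pipe.
    while True:
        req = requests.get()
        if req is None:
            break
        req_id, data = req
        completions.put((req_id, data.upper()))
        os.write(irq_write, b"\x01")

worker = threading.Thread(target=model_thread)
worker.start()

# Submitter: queue the requests and carry on; nothing blocks at this point.
requests.put((1, b"read sector 0"))
requests.put((2, b"write sector 7"))

# "Interrupt handler": each byte on the pipe announces one finished request.
results = {}
while len(results) < 2:
    os.read(irq_read, 1)
    req_id, response = completions.get()
    results[req_id] = response

requests.put(None)  # ask the worker to exit
worker.join()
os.close(irq_read)
os.close(irq_write)
```

Swapping the in-thread work for a call to an out-of-process model would leave the submitter and the completion path unchanged, which seems to be the attraction of this shape.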
Re: [Qemu-devel] custom virt-io support (in user-mode-linux)
Hi Anton,

> > I'm thinking about adding virt-io support to UML, but the tricky part is
> > that while I want to use the virt-io basics (because it's a nice
> > interface from the 'inside'), I don't actually want the stock drivers
> > that are part of the kernel now (like virtio-net etc.) but rather
> > something that integrates with wifi (probably building on hwsim).
>
> I have looked at using virtio semantics in UML in the past around the
> point when I wanted to make the recvmmsg/sendmmsg vector drivers common
> in UML and QEMU. It is certainly possible,
>
> I went for the native approach at the end though.

Hmm. I'm not sure what you mean by either :-)

Is there any commonality between the vector drivers? I can't see how
that'd work without a bus abstraction (like virtio) in qemu? I mean, the
kernel driver just calls uml_vector_sendmmsg(), which I'd say belongs
more to the 'outside world', but that can't really be done in qemu?

Ok, I guess then I see what you mean by 'native' though.

Similarly, of course, I can implement arbitrary virt-io devices - just
the kernel side doesn't call a function like uml_vector_sendmmsg()
directly, but instead the virt-io model, and the model calls the
function, which essentially is the same just with a (convenient)
abstraction layer.

But this leaves the fundamental fact the model code ("vector_user.c" or
a similar "virtio_user.c") is still part of the build.

I guess what I'm thinking is have something like "virtio_user_rpc.c"
that uses some appropriate RPC to interact with the real model. IOW,
rather than having all the model-specific logic actually be here (like
vector_user.c actually knows how to send network packets over a real
socket fd), try to call out to some RPC that contains the real model.

Now that I thought about it further, I guess my question boils down to
"did anyone ever think about doing RPC for Virt-IO instead of putting
the entire device model into the hypervisor/emulator/...".

johannes
Re: [Qemu-devel] custom virt-io support (in user-mode-linux)
On 22/05/2019 14:02, Johannes Berg wrote:
> Hi,
>
> While my main interest is mostly in UML right now [1] I've CC'ed the
> qemu and virtualization lists because something similar might actually
> apply to other types of virtualization.
>
> I'm thinking about adding virt-io support to UML, but the tricky part is
> that while I want to use the virt-io basics (because it's a nice
> interface from the 'inside'), I don't actually want the stock drivers
> that are part of the kernel now (like virtio-net etc.) but rather
> something that integrates with wifi (probably building on hwsim).
>
> The 'inside' interfaces aren't really a problem - just have a specific
> device ID for this, and then write a normal virtio kernel driver for it.
>
> The 'outside' interfaces are where my thinking breaks down right now.
>
> Looking at lkl, the outside is just all implemented in lkl as code that
> gets linked to the library, so in UML terms it'd just be extra 'outside'
> code like the timer handling or other netdev stuff we have today.
> Looking at qemu, it's of course also implemented there, and then
> interfaces with the real network, console abstraction, etc.
>
> However, like I said above, I really need something very custom and not
> likely to make it upstream to any project (because what point is that if
> you cannot connect to the rest of the environment I'm building), so I'm
> thinking that perhaps it should be possible to write an abstract
> 'outside' that lets you interact with it really from out-of-process?
> Perhaps through some kind of shared memory segment? I think that gets
> tricky with virt-io doing DMA (I think it does?) though, so that part
> would have to be implemented directly and not out-of-process?
>
> But really that's why I'm asking - is there a better way than to just
> link the device-side virt-io code into the same binary (be it lkl lib,
> uml binary, qemu binary)?
>
> Thanks,
> johannes
>
> [1] Actually, I've considered using qemu, but it doesn't have
> virtualized time and doesn't seem to support TSC virtualization. I guess
> I could remove TSC from the guest CPU and add a virtualized HPET, but
> I've yet to convince myself this works - on UML I made virtual time as a
> prototype already: https://patchwork.ozlabs.org/patch/1095814/
> (though my real goal isn't to just skip time forward when the host goes
> idle, it's to sync with other simulated components)

I have looked at using virtio semantics in UML in the past around the
point when I wanted to make the recvmmsg/sendmmsg vector drivers common
in UML and QEMU. It is certainly possible,

I went for the native approach at the end though.

--
Anton R. Ivanov
https://www.kot-begemot.co.uk/
[Qemu-devel] custom virt-io support (in user-mode-linux)
Hi,

While my main interest is mostly in UML right now [1] I've CC'ed the
qemu and virtualization lists because something similar might actually
apply to other types of virtualization.

I'm thinking about adding virt-io support to UML, but the tricky part is
that while I want to use the virt-io basics (because it's a nice
interface from the 'inside'), I don't actually want the stock drivers
that are part of the kernel now (like virtio-net etc.) but rather
something that integrates with wifi (probably building on hwsim).

The 'inside' interfaces aren't really a problem - just have a specific
device ID for this, and then write a normal virtio kernel driver for it.

The 'outside' interfaces are where my thinking breaks down right now.

Looking at lkl, the outside is just all implemented in lkl as code that
gets linked to the library, so in UML terms it'd just be extra 'outside'
code like the timer handling or other netdev stuff we have today.
Looking at qemu, it's of course also implemented there, and then
interfaces with the real network, console abstraction, etc.

However, like I said above, I really need something very custom and not
likely to make it upstream to any project (because what point is that if
you cannot connect to the rest of the environment I'm building), so I'm
thinking that perhaps it should be possible to write an abstract
'outside' that lets you interact with it really from out-of-process?
Perhaps through some kind of shared memory segment? I think that gets
tricky with virt-io doing DMA (I think it does?) though, so that part
would have to be implemented directly and not out-of-process?

But really that's why I'm asking - is there a better way than to just
link the device-side virt-io code into the same binary (be it lkl lib,
uml binary, qemu binary)?

Thanks,
johannes

[1] Actually, I've considered using qemu, but it doesn't have
virtualized time and doesn't seem to support TSC virtualization. I guess
I could remove TSC from the guest CPU and add a virtualized HPET, but
I've yet to convince myself this works - on UML I made virtual time as a
prototype already: https://patchwork.ozlabs.org/patch/1095814/
(though my real goal isn't to just skip time forward when the host goes
idle, it's to sync with other simulated components)