RE: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
> From: Jorgen S. Hansen [mailto:jhan...@vmware.com] > Sent: Wednesday, September 6, 2017 7:11 AM >> ... > > I'm currently working on NFS over AF_VSOCK and sock_diag support (for > > ss(8) and netstat-like tools). > > > > Multi-transport support is lower priority for me at the moment. I'm > > happy to review patches though. If there is no progress on this by the > > end of the year then I will have time to work on it. > > > > I’ll try to find time to write a more coherent proposal in the coming weeks, > and we can discuss that. > > Jorgen Thank you! Thanks, -- Dexuan
Re: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
> On Aug 31, 2017, at 1:54 PM, Stefan Hajnoczi wrote: > > On Tue, Aug 29, 2017 at 03:37:07PM +, Jorgen S. Hansen wrote: >>> On Aug 29, 2017, at 4:36 AM, Dexuan Cui wrote: >> If we allow multiple host side transports, virtio host side support and >> vmci should be able to coexist regardless of the order of initialization. > > That sounds good to me. > > This means af_vsock.c needs to be aware of CID allocation. Currently the > vhost_vsock.ko driver handles this itself (it keeps a list of CIDs and > checks that they are not used twice). It should be possible to move > that state into af_vsock.c so we have pairs. > Yes, that was my thinking as well. > I'm currently working on NFS over AF_VSOCK and sock_diag support (for > ss(8) and netstat-like tools). > > Multi-transport support is lower priority for me at the moment. I'm > happy to review patches though. If there is no progress on this by the > end of the year then I will have time to work on it. > I’ll try to find time to write a more coherent proposal in the coming weeks, and we can discuss that. > Are either of you are in Prague, Czech Republic on October 25-27 for > Linux Kernel Summit, Open Source Summit Europe, Embedded Linux > Conference Europe, KVM Forum, or MesosCon Europe? No, I won’t be there. Thanks, Jorgen
RE: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com] > Sent: Thursday, August 31, 2017 4:55 AM > ... > On Tue, Aug 29, 2017 at 03:37:07PM +, Jorgen S. Hansen wrote: > > > On Aug 29, 2017, at 4:36 AM, Dexuan Cui wrote: > > If we allow multiple host side transports, virtio host side support and > > vmci should be able to coexist regardless of the order of initialization. > > That sounds good to me. > > This means af_vsock.c needs to be aware of CID allocation. Currently the > vhost_vsock.ko driver handles this itself (it keeps a list of CIDs and > checks that they are not used twice). It should be possible to move > that state into af_vsock.c so we have pairs. > > I'm currently working on NFS over AF_VSOCK and sock_diag support (for > ss(8) and netstat-like tools). > > Multi-transport support is lower priority for me at the moment. I'm > happy to review patches though. If there is no progress on this by the > end of the year then I will have time to work on it. I understand. Thank you both for sharing the details about the plan! > Are either of you are in Prague, Czech Republic on October 25-27 for > Linux Kernel Summit, Open Source Summit Europe, Embedded Linux > Conference Europe, KVM Forum, or MesosCon Europe? > > Stefan I regret I won't be there this year. Thanks, -- Dexuan
Re: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
On Tue, Aug 29, 2017 at 03:37:07PM +, Jorgen S. Hansen wrote: > > On Aug 29, 2017, at 4:36 AM, Dexuan Cui wrote: > If we allow multiple host side transports, virtio host side support and > vmci should be able to coexist regardless of the order of initialization. That sounds good to me. This means af_vsock.c needs to be aware of CID allocation. Currently the vhost_vsock.ko driver handles this itself (it keeps a list of CIDs and checks that they are not used twice). It should be possible to move that state into af_vsock.c so we have pairs. I'm currently working on NFS over AF_VSOCK and sock_diag support (for ss(8) and netstat-like tools). Multi-transport support is lower priority for me at the moment. I'm happy to review patches though. If there is no progress on this by the end of the year then I will have time to work on it. Are either of you are in Prague, Czech Republic on October 25-27 for Linux Kernel Summit, Open Source Summit Europe, Embedded Linux Conference Europe, KVM Forum, or MesosCon Europe? Stefan
Re: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
> On Aug 29, 2017, at 4:36 AM, Dexuan Cui wrote: > >> From: Dexuan Cui >> Sent: Tuesday, August 22, 2017 21:21 >>> ... >>> ... >>> The only problem here would be the potential for a guest and a host app >> to >>> have a conflict wrt port numbers, even though they would be able to >>> operate fine, if restricted to their appropriate transport. >>> >>> Thanks, >>> Jorgen >> >> Hi Jorgen, Stefan, >> Thank you for the detailed analysis! >> You have a much better understanding than me about the complex >> scenarios. Can you please work out a patch? :-) > > Hi Jorgen, Stefan, > May I know your plan for this? I’d be happy to discuss this now, but I don’t have time to work on the actual implementation in the next couple of weeks. For the guest, we agree that registering a guest transport should be tied to either the hypervisor running the guest, or the existence of the appropriate virtual hardware. My main concern is that our existing software relies on the current behavior of auto-loading the VMCI transport on the host side. So changing that behavior could cause trouble for our existing Linux users. I’m wondering whether it would be possible to support multiple host side transports. Since it is theoretically possible to run multiple hypervisors at the same time, there would even be a use case for this. If the assignment of CIDs is made unique across all transports, a connection initiated by the host side will get directed to the appropriate transport based on the CID. Any traffic coming from a guest will naturally be using the appropriate transport. Host side applications will have to share the port space as well. This would require tracking CIDs as a common resource across all transports, but make CIDs unique VM addresses on a given host. If we allow multiple host side transports, virtio host side support and vmci should be able to coexist regardless of the order of initialization. Thanks, Jorgen
RE: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
> From: Dexuan Cui > Sent: Tuesday, August 22, 2017 21:21 > > ... > > ... > > The only problem here would be the potential for a guest and a host app > to > > have a conflict wrt port numbers, even though they would be able to > > operate fine, if restricted to their appropriate transport. > > > > Thanks, > > Jorgen > > Hi Jorgen, Stefan, > Thank you for the detailed analysis! > You have a much better understanding than me about the complex > scenarios. Can you please work out a patch? :-) Hi Jorgen, Stefan, May I know your plan for this? > IMO Linux driver of Hyper-V sockets is the simplest case, as we only have > the "to host" option (the host side driver of Hyper-V sockets runs on > Windows kernel and I don't think the other hypervisors emulate > the full Hyper-V VMBus 4.0, which is required to support Hyper-V sockets). > > -- Dexuan Thanks, -- Dexuan
RE: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
> From: Jorgen S. Hansen [mailto:jhan...@vmware.com] > > On Aug 22, 2017, at 11:54 AM, Stefan Hajnoczi > wrote: > > ... > > We *can* by looking at the destination CID. Please take a look at > > drivers/misc/vmw_vmci/vmci_route.c:vmci_route() to see how VMCI > handles > > nested virt. > > > > It boils down to something like this: > > > > static int vsock_stream_connect(struct socket *sock, struct sockaddr *addr, > > int addr_len, int flags) > > { > > ... > > if (remote_addr.svm_cid == VMADDR_CID_HOST) > > transport = host_transport; > > else > > transport = guest_transport; > > > > It's easy for connect(2) but Jorgen mentioned it's harder for listen(2) > > because the socket would need to listen on both transports. We define > > two new constants VMADDR_CID_LISTEN_FROM_GUEST and > > VMADDR_CID_LISTEN_FROM_HOST for bind(2) so that applications can > decide > > which side to listen on. > > If a socket is bound to VMADDR_CID_HOST, we would consider that socket as > bound to the host side transport, so that would be the same as > VMADDR_CID_LISTEN_FROM_GUEST. For the guest, we have > IOCTL_VM_SOCKETS_GET_LOCAL_CID, so that could be used to get and bind > a socket to the guest transport (VMCI will always return the guest CID as the > local one, if the VMCI driver is used in a guest, and it looks like virtio > will do > the same). We could treat VMADDR_CID_ANY as always being the guest > transport, since that is the use case where you don’t know upfront what > your CID is, if we don’t want to listen on all transports. So we would use the > host transport, if a socket is bound to VMADDR_CID_HOST, or if there is no > guest transport, and in all other cases use the guest transport. However, > having a couple of symbolic names like you suggest certainly makes it more > obvious, and could be used in combination with this. It would be a plus if > existing applications would function as intended in most cases. > > > Or the listen socket could simply listen to > > both sides. > > The only problem here would be the potential for a guest and a host app to > have a conflict wrt port numbers, even though they would be able to > operate fine, if restricted to their appropriate transport. > > Thanks, > Jorgen Hi Jorgen, Stefan, Thank you for the detailed analysis! You have a much better understanding than me about the complex scenarios. Can you please work out a patch? :-) IMO Linux driver of Hyper-V sockets is the simplest case, as we only have the "to host" option (the host side driver of Hyper-V sockets runs on Windows kernel and I don't think the other hypervisors emulate the full Hyper-V VMBus 4.0, which is required to support Hyper-V sockets). -- Dexuan
Re: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
> On Aug 22, 2017, at 11:54 AM, Stefan Hajnoczi wrote: > > On Fri, Aug 18, 2017 at 11:07:37PM +, Dexuan Cui wrote: >>> Not all features need to be supported. For example, VMCI supports >>> SOCK_DGRAM while Hyper-V and virtio do not. But features that are >>> available should behave identically. >> I totally agree, though I'm afraid Hyper-V may have a little more limitations >> compared to VMware/KVM duo to the <--> >> mapping. >> Can we use the 'protocol' parameter in the socket() function: int socket(int domain, int type, int protocol) IMO currently the 'protocol' is not really used. I think we can modify __vsock_core_init() to allow multiple transport layers >>> to be registered, and we can define different 'protocol' numbers for VMware/KVM/Hyper-V, and ask the application to explicitly specify what >>> should be used. Considering compatibility, we can use the default transport in a >>> given VM depending on the underlying hypervisor. >>> >>> I think AF_VSOCK should hide the transport from users/applications. >> Ideally yes, but let's consider the KVM-on-KVM nested scenario: when >> an application in the Level-1 VM creates an AF_VSOCK socket and call >> connect() for it, how can we know if the app is trying to connect to >> the Level-0 host, or connect to the Level-2 VM? We can't. > > We *can* by looking at the destination CID. Please take a look at > drivers/misc/vmw_vmci/vmci_route.c:vmci_route() to see how VMCI handles > nested virt. > > It boils down to something like this: > > static int vsock_stream_connect(struct socket *sock, struct sockaddr *addr, > int addr_len, int flags) > { > ... > if (remote_addr.svm_cid == VMADDR_CID_HOST) > transport = host_transport; > else > transport = guest_transport; > > It's easy for connect(2) but Jorgen mentioned it's harder for listen(2) > because the socket would need to listen on both transports. We define > two new constants VMADDR_CID_LISTEN_FROM_GUEST and > VMADDR_CID_LISTEN_FROM_HOST for bind(2) so that applications can decide > which side to listen on. If a socket is bound to VMADDR_CID_HOST, we would consider that socket as bound to the host side transport, so that would be the same as VMADDR_CID_LISTEN_FROM_GUEST. For the guest, we have IOCTL_VM_SOCKETS_GET_LOCAL_CID, so that could be used to get and bind a socket to the guest transport (VMCI will always return the guest CID as the local one, if the VMCI driver is used in a guest, and it looks like virtio will do the same). We could treat VMADDR_CID_ANY as always being the guest transport, since that is the use case where you don’t know upfront what your CID is, if we don’t want to listen on all transports. So we would use the host transport, if a socket is bound to VMADDR_CID_HOST, or if there is no guest transport, and in all other cases use the guest transport. However, having a couple of symbolic names like you suggest certainly makes it more obvious, and could be used in combination with this. It would be a plus if existing applications would function as intended in most cases. > Or the listen socket could simply listen to > both sides. The only problem here would be the potential for a guest and a host app to have a conflict wrt port numbers, even though they would be able to operate fine, if restricted to their appropriate transport. Thanks, Jorgen
Re: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
On Fri, Aug 18, 2017 at 11:07:37PM +, Dexuan Cui wrote: > > From: Stefan Hajnoczi [mailto:stefa...@redhat.com] > > > CID is not really used by us, because we only support guest<->host > > communication, > > > and don't support guest<->guest communication. The Hyper-V host > > references > > > every VM by VmID (which is invisible to the VM), and a VM can only talk to > > the > > > host via this feature. > > > > Applications running inside the guest should use VMADDR_CID_HOST (2) to > > connect to the host, even on Hyper-V. > I have no objection, and this patch does support this usage of the > user-space applications. > > > By the way, we should collaborate on a test suite and a vsock(7) man > > page that documents the semantics of AF_VSOCK sockets. This way our > > transports will have the same behavior and AF_VSOCK applications will > > work on all 3 hypervisors. > I can't agree more. :-) > BTW, I have been using Rolf's test suite to test my patch: > https://github.com/rn/virtsock/tree/master/c > Maybe this can be a good starting point. Thanks for sharing this, I will try it with virtio-vsock. I have a netcat-like utility here: https://github.com/stefanha/linux/blob/vsock-extras/nc-vsock.c > > Not all features need to be supported. For example, VMCI supports > > SOCK_DGRAM while Hyper-V and virtio do not. But features that are > > available should behave identically. > I totally agree, though I'm afraid Hyper-V may have a little more limitations > compared to VMware/KVM duo to the <--> > mapping. > > > > Can we use the 'protocol' parameter in the socket() function: > > > int socket(int domain, int type, int protocol) > > > > > > IMO currently the 'protocol' is not really used. > > > I think we can modify __vsock_core_init() to allow multiple transport > > > layers > > to > > > be registered, and we can define different 'protocol' numbers for > > > VMware/KVM/Hyper-V, and ask the application to explicitly specify what > > should > > > be used. Considering compatibility, we can use the default transport in a > > given > > > VM depending on the underlying hypervisor. > > > > I think AF_VSOCK should hide the transport from users/applications. > Ideally yes, but let's consider the KVM-on-KVM nested scenario: when > an application in the Level-1 VM creates an AF_VSOCK socket and call > connect() for it, how can we know if the app is trying to connect to > the Level-0 host, or connect to the Level-2 VM? We can't. We *can* by looking at the destination CID. Please take a look at drivers/misc/vmw_vmci/vmci_route.c:vmci_route() to see how VMCI handles nested virt. It boils down to something like this: static int vsock_stream_connect(struct socket *sock, struct sockaddr *addr, int addr_len, int flags) { ... if (remote_addr.svm_cid == VMADDR_CID_HOST) transport = host_transport; else transport = guest_transport; It's easy for connect(2) but Jorgen mentioned it's harder for listen(2) because the socket would need to listen on both transports. We define two new constants VMADDR_CID_LISTEN_FROM_GUEST and VMADDR_CID_LISTEN_FROM_HOST for bind(2) so that applications can decide which side to listen on. Or the listen socket could simply listen to both sides. Stefan
RE: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
> From: Stefan Hajnoczi [mailto:stefa...@redhat.com] > > CID is not really used by us, because we only support guest<->host > communication, > > and don't support guest<->guest communication. The Hyper-V host > references > > every VM by VmID (which is invisible to the VM), and a VM can only talk to > the > > host via this feature. > > Applications running inside the guest should use VMADDR_CID_HOST (2) to > connect to the host, even on Hyper-V. I have no objection, and this patch does support this usage of the user-space applications. > By the way, we should collaborate on a test suite and a vsock(7) man > page that documents the semantics of AF_VSOCK sockets. This way our > transports will have the same behavior and AF_VSOCK applications will > work on all 3 hypervisors. I can't agree more. :-) BTW, I have been using Rolf's test suite to test my patch: https://github.com/rn/virtsock/tree/master/c Maybe this can be a good starting point. > Not all features need to be supported. For example, VMCI supports > SOCK_DGRAM while Hyper-V and virtio do not. But features that are > available should behave identically. I totally agree, though I'm afraid Hyper-V may have a little more limitations compared to VMware/KVM duo to the <--> mapping. > > Can we use the 'protocol' parameter in the socket() function: > > int socket(int domain, int type, int protocol) > > > > IMO currently the 'protocol' is not really used. > > I think we can modify __vsock_core_init() to allow multiple transport layers > to > > be registered, and we can define different 'protocol' numbers for > > VMware/KVM/Hyper-V, and ask the application to explicitly specify what > should > > be used. Considering compatibility, we can use the default transport in a > given > > VM depending on the underlying hypervisor. > > I think AF_VSOCK should hide the transport from users/applications. Ideally yes, but let's consider the KVM-on-KVM nested scenario: when an application in the Level-1 VM creates an AF_VSOCK socket and call connect() for it, how can we know if the app is trying to connect to the Level-0 host, or connect to the Level-2 VM? We can't. That's why I propose we should use the 'protocol' parameter to distinguish between "to guest" and "to host". With my proposal, in the above scenario, by default (the 'protocol' is 0), we choose the "to host" transport layer when socket() is called; if the userspace app explicitly specifies "to guest", we choose the "to guest" transport layer when socket() is called. This way, the connect(), bind(), etc. can work automatically. (Of course, the default transport for a give VM can be better chosen if we detect which nested level the app is running on.) > Think of same-on-same nested virtualization: VMware-on-VMware or > KVM-on-KVM. In that case specifying VMCI or virtio doesn't help. > > We'd still need to distinguish between "to guest" and "to host" > (currently VMCI has code to do this but virtio does not). > > The natural place to distinguish the destination is when dealing with > the sockaddr in connect(), bind(), etc. > > Stefan Thanks, -- Dexuan
Re: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
On Fri, Aug 18, 2017 at 03:07:30AM +, Dexuan Cui wrote: > > From: Jorgen S. Hansen [mailto:jhan...@vmware.com] > > Sent: Thursday, August 17, 2017 08:17 > > > > > > Putting aside nested virtualization, I want to load the transport (vmci, > > > Hyper-V, vsock) for which there is paravirtualized hardware present > > > inside the guest. > > > > Good points. Completely agree that this is the desired behavior for a guest. > > > > > > > It's a little tricker on the host side (doesn't matter for Hyper-V and > > > probably also doesn't for VMware) because the host-side driver is a > > > software device with no hardware backing it. In KVM we assume the > > > vhost_vsock.ko kernel module will be loaded sufficiently early. > > > > Since the vmci driver is currently tied to PF_VSOCK it hasn’t been a > > problem, > > but on the host side the VMCI driver has no hardware backing it either, so > > when we move to a more appropriate solution, this will be an issue for VMCI > > as > > well. I’ll check our shipped products, but they most likely assume that if > > an > > upstreamed vmci module is present, it will be loaded automatically. > > Hyper-V Sockets is a standard feature of VMBus v4.0, so we can easily know > we can and should load iff vmbus_proto_version >= VERSION_WIN10. > > > > Things get trickier with nested virtualization because the VM might want > > > to talk to its host but also to its nested VMs. The simple way of > > > fixing this would be to allow two transports loaded simultaneously and > > > route traffic destined to CID 2 to the host transport and all other > > > traffic to the guest transport. > > This sounds like a little tricky to me. > CID is not really used by us, because we only support guest<->host > communication, > and don't support guest<->guest communication. The Hyper-V host references > every VM by VmID (which is invisible to the VM), and a VM can only talk to the > host via this feature. Applications running inside the guest should use VMADDR_CID_HOST (2) to connect to the host, even on Hyper-V. By the way, we should collaborate on a test suite and a vsock(7) man page that documents the semantics of AF_VSOCK sockets. This way our transports will have the same behavior and AF_VSOCK applications will work on all 3 hypervisors. Not all features need to be supported. For example, VMCI supports SOCK_DGRAM while Hyper-V and virtio do not. But features that are available should behave identically. > > This is close to the routing the VMCI driver does in a nested environment, > > but > > that is with the assumption that there is only one type of transport. > > Having two > > different transports would require that we delay resolving the transport > > type > > until the socket endpoint has been bound to an address. Things get trickier > > if > > listening sockets use VMADDR_CID_ANY - if only one transport is present, > > this > > would allow the socket to accept connections from both guests and outer > > host, > > but with multiple transports that won’t work, since we can’t associate a > > socket > > with a transport until the socket is bound. > > > > > > > > Perhaps we should discuss these cases a bit more to figure out how to > > > avoid conflicts over MODULE_ALIAS_NETPROTO(PF_VSOCK). > > > > Agreed. > > Can we use the 'protocol' parameter in the socket() function: > int socket(int domain, int type, int protocol) > > IMO currently the 'protocol' is not really used. > I think we can modify __vsock_core_init() to allow multiple transport layers > to > be registered, and we can define different 'protocol' numbers for > VMware/KVM/Hyper-V, and ask the application to explicitly specify what should > be used. Considering compatibility, we can use the default transport in a > given > VM depending on the underlying hypervisor. I think AF_VSOCK should hide the transport from users/applications. Think of same-on-same nested virtualization: VMware-on-VMware or KVM-on-KVM. In that case specifying VMCI or virtio doesn't help. We'd still need to distinguish between "to guest" and "to host" (currently VMCI has code to do this but virtio does not). The natural place to distinguish the destination is when dealing with the sockaddr in connect(), bind(), etc. Stefan signature.asc Description: PGP signature
RE: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
> From: Jorgen S. Hansen [mailto:jhan...@vmware.com] > Sent: Thursday, August 17, 2017 08:17 > > > > Putting aside nested virtualization, I want to load the transport (vmci, > > Hyper-V, vsock) for which there is paravirtualized hardware present > > inside the guest. > > Good points. Completely agree that this is the desired behavior for a guest. > > > > It's a little tricker on the host side (doesn't matter for Hyper-V and > > probably also doesn't for VMware) because the host-side driver is a > > software device with no hardware backing it. In KVM we assume the > > vhost_vsock.ko kernel module will be loaded sufficiently early. > > Since the vmci driver is currently tied to PF_VSOCK it hasn’t been a problem, > but on the host side the VMCI driver has no hardware backing it either, so > when we move to a more appropriate solution, this will be an issue for VMCI as > well. I’ll check our shipped products, but they most likely assume that if an > upstreamed vmci module is present, it will be loaded automatically. Hyper-V Sockets is a standard feature of VMBus v4.0, so we can easily know we can and should load iff vmbus_proto_version >= VERSION_WIN10. > > Things get trickier with nested virtualization because the VM might want > > to talk to its host but also to its nested VMs. The simple way of > > fixing this would be to allow two transports loaded simultaneously and > > route traffic destined to CID 2 to the host transport and all other > > traffic to the guest transport. This sounds like a little tricky to me. CID is not really used by us, because we only support guest<->host communication, and don't support guest<->guest communication. The Hyper-V host references every VM by VmID (which is invisible to the VM), and a VM can only talk to the host via this feature. > This is close to the routing the VMCI driver does in a nested environment, but > that is with the assumption that there is only one type of transport. Having > two > different transports would require that we delay resolving the transport type > until the socket endpoint has been bound to an address. Things get trickier if > listening sockets use VMADDR_CID_ANY - if only one transport is present, this > would allow the socket to accept connections from both guests and outer host, > but with multiple transports that won’t work, since we can’t associate a > socket > with a transport until the socket is bound. > > > > > Perhaps we should discuss these cases a bit more to figure out how to > > avoid conflicts over MODULE_ALIAS_NETPROTO(PF_VSOCK). > > Agreed. Can we use the 'protocol' parameter in the socket() function: int socket(int domain, int type, int protocol) IMO currently the 'protocol' is not really used. I think we can modify __vsock_core_init() to allow multiple transport layers to be registered, and we can define different 'protocol' numbers for VMware/KVM/Hyper-V, and ask the application to explicitly specify what should be used. Considering compatibility, we can use the default transport in a given VM depending on the underlying hypervisor. -- Dexuan
RE: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
> From: David Miller [mailto:da...@davemloft.net] > Sent: Thursday, August 17, 2017 10:04 > I would avoid module parameters at all costs. > > It is the worst possible interface for users of your software. > > You really need to fundamentally solve the problems related to making > sure the proper modules for the VM actually present on the system get > loaded when necessary rather than adding hacks like this. > > Unlike a proper solution, these hacks are ugly but have to stay around > forever once you put them in place. Sorry for reminding me again, David! :-) I'll try to figure out the correct solution. Thanks, -- Dexuan
Re: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
From: Dexuan Cui Date: Thu, 17 Aug 2017 08:00:29 + > @@ -73,6 +74,10 @@ struct vmci_transport_recv_pkt_info { > struct vmci_transport_packet pkt; > }; > > +static bool skip_hypervisor_check; > +module_param(skip_hypervisor_check, bool, 0444); > +MODULE_PARM_DESC(hot_add, "If set, attempt to load on non-VMware platforms"); > + I would avoid module parameters at all costs. It is the worst possible interface for users of your software. You really need to fundamentally solve the problems related to making sure the proper modules for the VM actually present on the system get loaded when necessary rather than adding hacks like this. Unlike a proper solution, these hacks are ugly but have to stay around forever once you put them in place.
Re: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
> On Aug 17, 2017, at 3:55 PM, Stefan Hajnoczi wrote: > > On Thu, Aug 17, 2017 at 08:00:29AM +, Dexuan Cui wrote: >> >> Without the patch, vmw_vsock_vmci_transport.ko can automatically load >> when an application creates an AF_VSOCK socket. >> >> This is the expected good behavior on VMware hypervisor, but as we >> are going to add hv_sock.ko (i.e. Hyper-V transport for AF_VSOCK), we >> should make sure vmw_vsock_vmci_transport.ko can't load on Hyper-V, >> otherwise there is a -EBUSY conflict when both vmw_vsock_vmci_transport.ko >> and hv_sock.ko try to call vsock_core_init() on Hyper-V. >> >> On the other hand, hv_sock.ko can only load on Hyper-V, because it >> depends on hv_vmbus.ko, which detects Hyper-V in hv_acpi_init(). >> >> KVM's vsock_virtio_transport doesn't have the issue because it doesn't >> define MODULE_ALIAS_NETPROTO(PF_VSOCK). > > Thanks for sending this patch, vmci's MODULE_ALIAS_NETPROTO(PF_VSOCK) is > a problem for vhost_vsock.ko (the virtio host driver) too. A host > userspace program can create a AF_VSOCK socket before vhost_vsock is > loaded. The vmci transport will be unconditionally loaded and that's > not the right behavior. > > Putting aside nested virtualization, I want to load the transport (vmci, > Hyper-V, vsock) for which there is paravirtualized hardware present > inside the guest. Good points. Completely agree that this is the desired behavior for a guest. > It's a little tricker on the host side (doesn't matter for Hyper-V and > probably also doesn't for VMware) because the host-side driver is a > software device with no hardware backing it. In KVM we assume the > vhost_vsock.ko kernel module will be loaded sufficiently early. Since the vmci driver is currently tied to PF_VSOCK it hasn’t been a problem, but on the host side the VMCI driver has no hardware backing it either, so when we move to a more appropriate solution, this will be an issue for VMCI as well. I’ll check our shipped products, but they most likely assume that if an upstreamed vmci module is present, it will be loaded automatically. > > Things get trickier with nested virtualization because the VM might want > to talk to its host but also to its nested VMs. The simple way of > fixing this would be to allow two transports loaded simultaneously and > route traffic destined to CID 2 to the host transport and all other > traffic to the guest transport. This is close to the routing the VMCI driver does in a nested environment, but that is with the assumption that there is only one type of transport. Having two different transports would require that we delay resolving the transport type until the socket endpoint has been bound to an address. Things get trickier if listening sockets use VMADDR_CID_ANY - if only one transport is present, this would allow the socket to accept connections from both guests and outer host, but with multiple transports that won’t work, since we can’t associate a socket with a transport until the socket is bound. > > Perhaps we should discuss these cases a bit more to figure out how to > avoid conflicts over MODULE_ALIAS_NETPROTO(PF_VSOCK). Agreed. > >> >> The patch also adds a module parameter "skip_hypervisor_check" for >> vmw_vsock_vmci_transport.ko. >> >> Signed-off-by: Dexuan Cui >> Cc: Alok Kataria >> Cc: Andy King >> Cc: Adit Ranadive >> Cc: George Zhang >> Cc: Jorgen Hansen >> Cc: K. Y. Srinivasan >> Cc: Haiyang Zhang >> Cc: Stephen Hemminger >> --- >> net/vmw_vsock/Kconfig | 2 +- >> net/vmw_vsock/vmci_transport.c | 11 +++ >> 2 files changed, 12 insertions(+), 1 deletion(-) >> >> diff --git a/net/vmw_vsock/Kconfig b/net/vmw_vsock/Kconfig >> index a24369d..3f52929 100644 >> --- a/net/vmw_vsock/Kconfig >> +++ b/net/vmw_vsock/Kconfig >> @@ -17,7 +17,7 @@ config VSOCKETS >> >> config VMWARE_VMCI_VSOCKETS >> tristate "VMware VMCI transport for Virtual Sockets" >> -depends on VSOCKETS && VMWARE_VMCI >> +depends on VSOCKETS && VMWARE_VMCI && HYPERVISOR_GUEST >> help >>This module implements a VMCI transport for Virtual Sockets. >> >> diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c >> index 10ae782..c068873 100644 >> --- a/net/vmw_vsock/vmci_transport.c >> +++ b/net/vmw_vsock/vmci_transport.c >> @@ -16,6 +16,7 @@ >> #include >> #include >> #include >> +#include >> #include >> #include >> #include >> @@ -73,6 +74,10 @@ struct vmci_transport_recv_pkt_info { >> struct vmci_transport_packet pkt; >> }; >> >> +static bool skip_hypervisor_check; >> +module_param(skip_hypervisor_check, bool, 0444); >> +MODULE_PARM_DESC(hot_add, "If set, attempt to load on non-VMware >> platforms"); >> + >> static LIST_HEAD(vmci_transport_cleanup_list); >> static DEFINE_SPINLOCK(vmci_transport_cleanup_lock); >> static DECLARE_WORK(vmci_transport_cleanup_work, vmci_transport_cleanup); >> @@ -2085,6 +2090,12 @@ static int __init vmci_transport_init(void) >> { >> int er
Re: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
On Thu, Aug 17, 2017 at 08:00:29AM +, Dexuan Cui wrote: > > Without the patch, vmw_vsock_vmci_transport.ko can automatically load > when an application creates an AF_VSOCK socket. > > This is the expected good behavior on VMware hypervisor, but as we > are going to add hv_sock.ko (i.e. Hyper-V transport for AF_VSOCK), we > should make sure vmw_vsock_vmci_transport.ko can't load on Hyper-V, > otherwise there is a -EBUSY conflict when both vmw_vsock_vmci_transport.ko > and hv_sock.ko try to call vsock_core_init() on Hyper-V. > > On the other hand, hv_sock.ko can only load on Hyper-V, because it > depends on hv_vmbus.ko, which detects Hyper-V in hv_acpi_init(). > > KVM's vsock_virtio_transport doesn't have the issue because it doesn't > define MODULE_ALIAS_NETPROTO(PF_VSOCK). Thanks for sending this patch, vmci's MODULE_ALIAS_NETPROTO(PF_VSOCK) is a problem for vhost_vsock.ko (the virtio host driver) too. A host userspace program can create a AF_VSOCK socket before vhost_vsock is loaded. The vmci transport will be unconditionally loaded and that's not the right behavior. Putting aside nested virtualization, I want to load the transport (vmci, Hyper-V, vsock) for which there is paravirtualized hardware present inside the guest. It's a little tricker on the host side (doesn't matter for Hyper-V and probably also doesn't for VMware) because the host-side driver is a software device with no hardware backing it. In KVM we assume the vhost_vsock.ko kernel module will be loaded sufficiently early. Things get trickier with nested virtualization because the VM might want to talk to its host but also to its nested VMs. The simple way of fixing this would be to allow two transports loaded simultaneously and route traffic destined to CID 2 to the host transport and all other traffic to the guest transport. Perhaps we should discuss these cases a bit more to figure out how to avoid conflicts over MODULE_ALIAS_NETPROTO(PF_VSOCK). > > The patch also adds a module parameter "skip_hypervisor_check" for > vmw_vsock_vmci_transport.ko. > > Signed-off-by: Dexuan Cui > Cc: Alok Kataria > Cc: Andy King > Cc: Adit Ranadive > Cc: George Zhang > Cc: Jorgen Hansen > Cc: K. Y. Srinivasan > Cc: Haiyang Zhang > Cc: Stephen Hemminger > --- > net/vmw_vsock/Kconfig | 2 +- > net/vmw_vsock/vmci_transport.c | 11 +++ > 2 files changed, 12 insertions(+), 1 deletion(-) > > diff --git a/net/vmw_vsock/Kconfig b/net/vmw_vsock/Kconfig > index a24369d..3f52929 100644 > --- a/net/vmw_vsock/Kconfig > +++ b/net/vmw_vsock/Kconfig > @@ -17,7 +17,7 @@ config VSOCKETS > > config VMWARE_VMCI_VSOCKETS > tristate "VMware VMCI transport for Virtual Sockets" > - depends on VSOCKETS && VMWARE_VMCI > + depends on VSOCKETS && VMWARE_VMCI && HYPERVISOR_GUEST > help > This module implements a VMCI transport for Virtual Sockets. > > diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c > index 10ae782..c068873 100644 > --- a/net/vmw_vsock/vmci_transport.c > +++ b/net/vmw_vsock/vmci_transport.c > @@ -16,6 +16,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -73,6 +74,10 @@ struct vmci_transport_recv_pkt_info { > struct vmci_transport_packet pkt; > }; > > +static bool skip_hypervisor_check; > +module_param(skip_hypervisor_check, bool, 0444); > +MODULE_PARM_DESC(hot_add, "If set, attempt to load on non-VMware platforms"); > + > static LIST_HEAD(vmci_transport_cleanup_list); > static DEFINE_SPINLOCK(vmci_transport_cleanup_lock); > static DECLARE_WORK(vmci_transport_cleanup_work, vmci_transport_cleanup); > @@ -2085,6 +2090,12 @@ static int __init vmci_transport_init(void) > { > int err; > > + /* Check if we are running on VMware's hypervisor and bail out > + * if we are not. > + */ > + if (!skip_hypervisor_check && x86_hyper != &x86_hyper_vmware) > + return -ENODEV; > + > /* Create the datagram handle that we will use to send and receive all >* VSocket control messages for this context. >*/ > -- > 2.7.4 > signature.asc Description: PGP signature
[PATCH] vsock: only load vmci transport on VMware hypervisor by default
Without the patch, vmw_vsock_vmci_transport.ko can automatically load when an application creates an AF_VSOCK socket. This is the expected good behavior on VMware hypervisor, but as we are going to add hv_sock.ko (i.e. Hyper-V transport for AF_VSOCK), we should make sure vmw_vsock_vmci_transport.ko can't load on Hyper-V, otherwise there is a -EBUSY conflict when both vmw_vsock_vmci_transport.ko and hv_sock.ko try to call vsock_core_init() on Hyper-V. On the other hand, hv_sock.ko can only load on Hyper-V, because it depends on hv_vmbus.ko, which detects Hyper-V in hv_acpi_init(). KVM's vsock_virtio_transport doesn't have the issue because it doesn't define MODULE_ALIAS_NETPROTO(PF_VSOCK). The patch also adds a module parameter "skip_hypervisor_check" for vmw_vsock_vmci_transport.ko. Signed-off-by: Dexuan Cui Cc: Alok Kataria Cc: Andy King Cc: Adit Ranadive Cc: George Zhang Cc: Jorgen Hansen Cc: K. Y. Srinivasan Cc: Haiyang Zhang Cc: Stephen Hemminger --- net/vmw_vsock/Kconfig | 2 +- net/vmw_vsock/vmci_transport.c | 11 +++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/net/vmw_vsock/Kconfig b/net/vmw_vsock/Kconfig index a24369d..3f52929 100644 --- a/net/vmw_vsock/Kconfig +++ b/net/vmw_vsock/Kconfig @@ -17,7 +17,7 @@ config VSOCKETS config VMWARE_VMCI_VSOCKETS tristate "VMware VMCI transport for Virtual Sockets" - depends on VSOCKETS && VMWARE_VMCI + depends on VSOCKETS && VMWARE_VMCI && HYPERVISOR_GUEST help This module implements a VMCI transport for Virtual Sockets. diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c index 10ae782..c068873 100644 --- a/net/vmw_vsock/vmci_transport.c +++ b/net/vmw_vsock/vmci_transport.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include #include @@ -73,6 +74,10 @@ struct vmci_transport_recv_pkt_info { struct vmci_transport_packet pkt; }; +static bool skip_hypervisor_check; +module_param(skip_hypervisor_check, bool, 0444); +MODULE_PARM_DESC(hot_add, "If set, attempt to load on non-VMware platforms"); + static LIST_HEAD(vmci_transport_cleanup_list); static DEFINE_SPINLOCK(vmci_transport_cleanup_lock); static DECLARE_WORK(vmci_transport_cleanup_work, vmci_transport_cleanup); @@ -2085,6 +2090,12 @@ static int __init vmci_transport_init(void) { int err; + /* Check if we are running on VMware's hypervisor and bail out +* if we are not. +*/ + if (!skip_hypervisor_check && x86_hyper != &x86_hyper_vmware) + return -ENODEV; + /* Create the datagram handle that we will use to send and receive all * VSocket control messages for this context. */ -- 2.7.4