Re: RFC: New device for zero-copy VM memory access
On 2019-11-04 22:55, Dr. David Alan Gilbert wrote: * ge...@hostfission.com (ge...@hostfission.com) wrote: On 2019-11-03 21:10, ge...@hostfission.com wrote: > On 2019-11-01 02:52, Dr. David Alan Gilbert wrote: > > * ge...@hostfission.com (ge...@hostfission.com) wrote: > > > > > > > > > On 2019-11-01 01:52, Peter Maydell wrote: > > > > On Thu, 31 Oct 2019 at 14:26, wrote: > > > > > As the author of Looking Glass, I also have to consider the > > > > > maintenance > > > > > and the complexity of implementing the vhost protocol into the > > > > > project. > > > > > At this time a complete Porthole client can be implemented in 150 > > > > > lines > > > > > of C without external dependencies, and most of that is boilerplate > > > > > socket code. This IMO is a major factor in deciding to avoid > > > > > vhost-user. > > > > > > > > This is essentially a proposal that we should make our project and > > > > code more complicated so that your project and code can be simpler. > > > > I hope you can see why this isn't necessarily an argument that will hold > > > > very much weight for us :-) > > > > > > Certainly, I do which is why I am still going to see about using > > > vhost, > > > however, a device that uses vhost is likely more complex then > > > the device > > > as it stands right now and as such more maintenance would be > > > involved on > > > your end also. Or have I missed something in that vhost-user can > > > be used > > > directly as a device? > > > > The basic vhost-user stuff isn't actually that hard; if you aren't > > actually shuffling commands over the queues you should find it pretty > > simple - so I think your assumption about it being simpler if you > > avoid > > it might be wrong. It might be easier if you use it! > > I have been looking into this and I am yet to find some decent > documentation or a simple device example I can use to understand how to > create such a device. Do you know of any reading or examples I can > obtain > on how to get an initial do nothing device up and running? > > -Geoff Scratch that, the design just solidified for me and I am now making progress, however it seems that vhost-user can't do what we need here: 1) I dont see any way to recieve notification of socket disconnection, in our use case the client app needs to be able to be (re)connected dynamically. It might be possible to get this event by registering it on the chardev manually but this seems like it would be a kludge. My understanding was that someone added support for reconnection of vhost-user; I'm not sure of the detail - cc'ing in Maxime and Marc-Andre. 2) I don't see any method of notifying the vhost-user client of the removal of a shared memory mapping. Again, these may not be persistently mapped in the guest as we have no control over the buffer allocation, and as such, we need a method to notify the client that the mapping has become invalid. 3) VHOST_USER_SET_MEM_TABLE is a one time request, again this breaks our usage as we need to change this dynamically at runtime. I've seen (3) being sent multiple times (It's messy but it happens); so I think that fixes (2) as well for you. Yes, but it's ignored. /* * For non-vring specific requests, like VHOST_USER_SET_MEM_TABLE, * we just need send it once in the first time. For later such * request, we just ignore it. 
 */
    if (vhost_user_one_time_request(msg->hdr.request) &&
        dev->vq_index != 0) {
        msg->hdr.flags &= ~VHOST_USER_NEED_REPLY_MASK;
        return 0;
    }

Dave

Unless there are viable solutions to these problems there is no way that
vhost-user can be used for this kind of a device.

-Geoff

> > > > > Dave
> > > > thanks
> > > > -- PMM
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
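As an aside on point (1): the "kludge" of watching the chardev directly is not much code. Below is a rough sketch, with illustrative names only (PortholeState and the handler names are not from the RFC branch), of how a device can be told about client connects and disconnects through its chardev, assuming the usual CHR_EVENT_OPENED/CHR_EVENT_CLOSED events delivered via qemu_chr_fe_set_handlers():

/* Sketch: react to the porthole client (re)connecting or going away. */
static void porthole_chr_event(void *opaque, int event)
{
    PortholeState *s = opaque;

    switch (event) {
    case CHR_EVENT_OPENED:
        s->connected = true;        /* client (re)connected */
        break;
    case CHR_EVENT_CLOSED:
        s->connected = false;       /* invalidate anything we handed out */
        break;
    }
}

/* registered once at realize time, roughly: */
qemu_chr_fe_set_handlers(&s->chardev, porthole_can_read, porthole_read,
                         porthole_chr_event, NULL, s, NULL, true);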
Re: RFC: New device for zero-copy VM memory access
On 2019-11-04 21:26, Gerd Hoffmann wrote:

>   Hi,
>
> >   This new device, currently named `introspection` (which needs a more
> > suitable name, porthole perhaps?), provides a means of translating guest
> > physical addresses to host virtual addresses, and finally to the host
> > offsets in RAM for file-backed memory guests. It does this by means of a
> > simple protocol over a unix socket (chardev) which is supplied the
> > appropriate fd for the VM's system RAM. The guest (in this case,
> > Windows), when presented with the address of a userspace buffer and
> > size, will mlock the appropriate pages into RAM and pass guest physical
> > addresses to the virtual device.
>
> So, if I understand things correctly, the workflow looks like this:
>
> (1) guest allocates buffers, using guest ram.
> (2) guest uses these buffers as render target for the gpu (pci-assigned I guess?).
> (3) guest passes guest physical address to qemu (via porthole device).
> (4) qemu translates gpa into file offset and passes offsets to the client application.
> (5) client application maps all guest ram, then uses the offsets from qemu
>     to find the buffers. Then goes displaying these buffers I guess.
>
> Correct?

Correct, however step 5 might be a proxy to copy the buffers into another
porthole device in a second VM allowing VM->VM transfers.

> Performance aside for now, is it an option for your use case to simply
> use both an emulated display device and the assigned gpu, then configure
> screen mirroring inside the guest to get the guest display scanned out
> to the host?

Unfortunately no, NVidia and AMD devices do not support mirroring their
outputs to a separate GPU unless it's a professional-grade GPU such as a
Quadro or Firepro.

-Geoff

> cheers,
>   Gerd
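To make steps (4) and (5) concrete, here is a minimal sketch of the client side. It is not taken from the Looking Glass/porthole sources; the RAM fd and the file offset are assumed to arrive via the porthole protocol:

#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* one shared mapping of the guest's system RAM, using the fd QEMU sent */
static void *map_guest_ram(int ram_fd, size_t ram_size)
{
    return mmap(NULL, ram_size, PROT_READ | PROT_WRITE, MAP_SHARED, ram_fd, 0);
}

/* a guest buffer is then just base + the file offset QEMU reported */
static inline void *buffer_ptr(void *ram_base, uint64_t file_offset)
{
    return (uint8_t *)ram_base + file_offset;
}

Because the mapping is shared with the VM's backing file, reads through these pointers see the captured frames with no intermediate copy, which is the whole point of the exercise.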
Re: RFC: New device for zero-copy VM memory access
On 2019-11-03 21:10, ge...@hostfission.com wrote: On 2019-11-01 02:52, Dr. David Alan Gilbert wrote: * ge...@hostfission.com (ge...@hostfission.com) wrote: On 2019-11-01 01:52, Peter Maydell wrote: > On Thu, 31 Oct 2019 at 14:26, wrote: > > As the author of Looking Glass, I also have to consider the > > maintenance > > and the complexity of implementing the vhost protocol into the > > project. > > At this time a complete Porthole client can be implemented in 150 > > lines > > of C without external dependencies, and most of that is boilerplate > > socket code. This IMO is a major factor in deciding to avoid > > vhost-user. > > This is essentially a proposal that we should make our project and > code more complicated so that your project and code can be simpler. > I hope you can see why this isn't necessarily an argument that will hold > very much weight for us :-) Certainly, I do which is why I am still going to see about using vhost, however, a device that uses vhost is likely more complex then the device as it stands right now and as such more maintenance would be involved on your end also. Or have I missed something in that vhost-user can be used directly as a device? The basic vhost-user stuff isn't actually that hard; if you aren't actually shuffling commands over the queues you should find it pretty simple - so I think your assumption about it being simpler if you avoid it might be wrong. It might be easier if you use it! I have been looking into this and I am yet to find some decent documentation or a simple device example I can use to understand how to create such a device. Do you know of any reading or examples I can obtain on how to get an initial do nothing device up and running? -Geoff Scratch that, the design just solidified for me and I am now making progress, however it seems that vhost-user can't do what we need here: 1) I dont see any way to recieve notification of socket disconnection, in our use case the client app needs to be able to be (re)connected dynamically. It might be possible to get this event by registering it on the chardev manually but this seems like it would be a kludge. 2) I don't see any method of notifying the vhost-user client of the removal of a shared memory mapping. Again, these may not be persistently mapped in the guest as we have no control over the buffer allocation, and as such, we need a method to notify the client that the mapping has become invalid. 3) VHOST_USER_SET_MEM_TABLE is a one time request, again this breaks our usage as we need to change this dynamically at runtime. Unless there are viable solutions to these problems there is no way that vhost-user can be used for this kind of a device. -Geoff Dave > > thanks > -- PMM -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: RFC: New device for zero-copy VM memory access
On 2019-11-01 02:52, Dr. David Alan Gilbert wrote: * ge...@hostfission.com (ge...@hostfission.com) wrote: On 2019-11-01 01:52, Peter Maydell wrote: > On Thu, 31 Oct 2019 at 14:26, wrote: > > As the author of Looking Glass, I also have to consider the > > maintenance > > and the complexity of implementing the vhost protocol into the > > project. > > At this time a complete Porthole client can be implemented in 150 > > lines > > of C without external dependencies, and most of that is boilerplate > > socket code. This IMO is a major factor in deciding to avoid > > vhost-user. > > This is essentially a proposal that we should make our project and > code more complicated so that your project and code can be simpler. > I hope you can see why this isn't necessarily an argument that will hold > very much weight for us :-) Certainly, I do which is why I am still going to see about using vhost, however, a device that uses vhost is likely more complex then the device as it stands right now and as such more maintenance would be involved on your end also. Or have I missed something in that vhost-user can be used directly as a device? The basic vhost-user stuff isn't actually that hard; if you aren't actually shuffling commands over the queues you should find it pretty simple - so I think your assumption about it being simpler if you avoid it might be wrong. It might be easier if you use it! I have been looking into this and I am yet to find some decent documentation or a simple device example I can use to understand how to create such a device. Do you know of any reading or examples I can obtain on how to get an initial do nothing device up and running? -Geoff Dave > > thanks > -- PMM -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: RFC: New device for zero-copy VM memory access
On 2019-11-01 01:52, Peter Maydell wrote: On Thu, 31 Oct 2019 at 14:26, wrote: As the author of Looking Glass, I also have to consider the maintenance and the complexity of implementing the vhost protocol into the project. At this time a complete Porthole client can be implemented in 150 lines of C without external dependencies, and most of that is boilerplate socket code. This IMO is a major factor in deciding to avoid vhost-user. This is essentially a proposal that we should make our project and code more complicated so that your project and code can be simpler. I hope you can see why this isn't necessarily an argument that will hold very much weight for us :-) Certainly, I do which is why I am still going to see about using vhost, however, a device that uses vhost is likely more complex then the device as it stands right now and as such more maintenance would be involved on your end also. Or have I missed something in that vhost-user can be used directly as a device? thanks -- PMM
Re: RFC: New device for zero-copy VM memory access
s to write the Porthole driver which is far simpler than the IVSHMEM driver. I'd hate to think of the time investment in maintaining the vhost integration also (yes, I am aware there is a library). These drivers are not complex and I am sure an experienced windows driver developer could have thrown them together in a few hours, but since our requirements are so niche and of little commercial value those in our community that are using this project do not have the time and/or ability to assist with the drivers. From my point of view, avoiding vhost-use seems like a better path to take as I am able (and willing) to maintain the Porthole device in QEMU, the OS drivers, and client interface. The Porthole device also doesn't have any special or complex features keeping it very simple to maintain and keeping the client protocol very simple. There is also an open-source audio driver for windows called SCREAM that was initially designed for broadcasting audio over a network, however, it's authors have also implemented transport via shared RAM. While vhost-user would make much more sense here as vring buffers would be very useful, the barrier to entry is too high and as such the developers have instead opted to use the simple IVSHMEM device instead. That said, I will still have a deeper look into vhost-user but I hope the above shows the merits of this simple method of guest ram access. -Geoff Dave -Geoff > > This device and the windows driver have been designed in such a way > > that > > it's a > > utility device for any project and/or application that could make > > use of it. > > The PCI subsystem vendor and device ID are used to provide a means > > of device > > identification in cases where multiple devices may be in use for > > differing > > applications. This also allows one common driver to be used for any > > other > > projects wishing to build on this device. > > > > My ultimate goal is to get this to a state where it could be accepted > > upstream > > into Qemu at which point Looking Glass would be modified to use it > > instead > > of > > the IVSHMEM device. > > > > My git repository with the new device can be found at: > > https://github.com/gnif/qemu > > > > The new device is: > > https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c > > > > Looking Glass: > > https://looking-glass.hostfission.com/ > > > > The windows driver, while working, needs some cleanup before the > > source is > > published. I intend to maintain both this device and the windows > > driver > > including producing a signed Windows 10 driver if Redhat are > > unwilling or > > unable. > > > > Kind Regards, > > Geoffrey McRae > > > > HostFission > > https://hostfission.com > > > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: RFC: New device for zero-copy VM memory access
Another update to this that adds support for unmap notification. The device has also been renamed to `porthole` and now resides here: https://github.com/gnif/qemu/blob/master/hw/misc/porthole.c And here is the updated Linux test client. https://gist.github.com/gnif/77e7fb54604b42a1a98ecb8bf3d2cf46 -Geoff On 2019-10-31 13:55, ge...@hostfission.com wrote: Hi Dave, On 2019-10-31 05:52, Dr. David Alan Gilbert wrote: * ge...@hostfission.com (ge...@hostfission.com) wrote: Hi All, Over the past week, I have been working to come up with a solution to the memory transfer performance issues that hinder the Looking Glass Project. Currently Looking Glass works by using the IVSHMEM shared memory device which is fed by an application that captures the guest's video output. While this works it is sub-optimal because we first have to perform a CPU copy of the captured frame into shared RAM, and then back out again for display. Because the destination buffers are allocated by closed proprietary code (DirectX, or NVidia NvFBC) there is no way to have the frame placed directly into the IVSHMEM shared ram. This new device, currently named `introspection` (which needs a more suitable name, porthole perhaps?), provides a means of translating guest physical addresses to host virtual addresses, and finally to the host offsets in RAM for file-backed memory guests. It does this by means of a simple protocol over a unix socket (chardev) which is supplied the appropriate fd for the VM's system RAM. The guest (in this case, Windows), when presented with the address of a userspace buffer and size, will mlock the appropriate pages into RAM and pass guest physical addresses to the virtual device. Hi Geroggrey, I wonder if the same thing can be done by using the existing vhost-user mechanism. vhost-user is intended for implementing a virtio device outside of the qemu process; so it has a character device that qemu passes commands down to the other process, where qemu mostly passes commands via the virtio queues. To be able to read the virtio queues, the external process mmap's the same memory as the guest - it gets passed a 'set mem table' command by qemu that includes fd's for the RAM, and includes base/offset pairs saying that a particular chunk of RAM is mapped at a particular guest physical address. Whether or not you make use of virtio queues, I think the mechanism for the device to tell the external process the mappings might be what you're after. Dave While normally I would be all for re-using such code, the vhost-user while being very feature-complete from what I understand is overkill for our requirements. It will still allocate a communication ring and an events system that we will not be using. The goal of this device is to provide a dumb & simple method of sharing system ram, both for this project and for others that work on a simple polling mechanism, it is not intended to be an end-to-end solution like vhost-user is. If you still believe that vhost-user should be used, I will do what I can to implement it, but for such a simple device I honestly believe it is overkill. -Geoff This device and the windows driver have been designed in such a way that it's a utility device for any project and/or application that could make use of it. The PCI subsystem vendor and device ID are used to provide a means of device identification in cases where multiple devices may be in use for differing applications. This also allows one common driver to be used for any other projects wishing to build on this device. 
My ultimate goal is to get this to a state where it could be accepted upstream into Qemu at which point Looking Glass would be modified to use it instead of the IVSHMEM device. My git repository with the new device can be found at: https://github.com/gnif/qemu The new device is: https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c Looking Glass: https://looking-glass.hostfission.com/ The windows driver, while working, needs some cleanup before the source is published. I intend to maintain both this device and the windows driver including producing a signed Windows 10 driver if Redhat are unwilling or unable. Kind Regards, Geoffrey McRae HostFission https://hostfission.com -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: RFC: New device for zero-copy VM memory access
Hi Dave, On 2019-10-31 05:52, Dr. David Alan Gilbert wrote: * ge...@hostfission.com (ge...@hostfission.com) wrote: Hi All, Over the past week, I have been working to come up with a solution to the memory transfer performance issues that hinder the Looking Glass Project. Currently Looking Glass works by using the IVSHMEM shared memory device which is fed by an application that captures the guest's video output. While this works it is sub-optimal because we first have to perform a CPU copy of the captured frame into shared RAM, and then back out again for display. Because the destination buffers are allocated by closed proprietary code (DirectX, or NVidia NvFBC) there is no way to have the frame placed directly into the IVSHMEM shared ram. This new device, currently named `introspection` (which needs a more suitable name, porthole perhaps?), provides a means of translating guest physical addresses to host virtual addresses, and finally to the host offsets in RAM for file-backed memory guests. It does this by means of a simple protocol over a unix socket (chardev) which is supplied the appropriate fd for the VM's system RAM. The guest (in this case, Windows), when presented with the address of a userspace buffer and size, will mlock the appropriate pages into RAM and pass guest physical addresses to the virtual device. Hi Geroggrey, I wonder if the same thing can be done by using the existing vhost-user mechanism. vhost-user is intended for implementing a virtio device outside of the qemu process; so it has a character device that qemu passes commands down to the other process, where qemu mostly passes commands via the virtio queues. To be able to read the virtio queues, the external process mmap's the same memory as the guest - it gets passed a 'set mem table' command by qemu that includes fd's for the RAM, and includes base/offset pairs saying that a particular chunk of RAM is mapped at a particular guest physical address. Whether or not you make use of virtio queues, I think the mechanism for the device to tell the external process the mappings might be what you're after. Dave While normally I would be all for re-using such code, the vhost-user while being very feature-complete from what I understand is overkill for our requirements. It will still allocate a communication ring and an events system that we will not be using. The goal of this device is to provide a dumb & simple method of sharing system ram, both for this project and for others that work on a simple polling mechanism, it is not intended to be an end-to-end solution like vhost-user is. If you still believe that vhost-user should be used, I will do what I can to implement it, but for such a simple device I honestly believe it is overkill. -Geoff This device and the windows driver have been designed in such a way that it's a utility device for any project and/or application that could make use of it. The PCI subsystem vendor and device ID are used to provide a means of device identification in cases where multiple devices may be in use for differing applications. This also allows one common driver to be used for any other projects wishing to build on this device. My ultimate goal is to get this to a state where it could be accepted upstream into Qemu at which point Looking Glass would be modified to use it instead of the IVSHMEM device. 
My git repository with the new device can be found at: https://github.com/gnif/qemu The new device is: https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c Looking Glass: https://looking-glass.hostfission.com/ The windows driver, while working, needs some cleanup before the source is published. I intend to maintain both this device and the windows driver including producing a signed Windows 10 driver if Redhat are unwilling or unable. Kind Regards, Geoffrey McRae HostFission https://hostfission.com -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: RFC: New device for zero-copy VM memory access
The windows driver source is now available also. https://github.com/gnif/Porthole-Guest-Driver I have also opted to rename the device to 'porthole', hopefully this name is acceptable. On 2019-10-30 09:53, ge...@hostfission.com wrote: Just to follow this up, here is a sample client application for this device https://gist.github.com/gnif/77e7fb54604b42a1a98ecb8bf3d2cf46 -Geoff On 2019-10-30 01:31, ge...@hostfission.com wrote: Hi All, Over the past week, I have been working to come up with a solution to the memory transfer performance issues that hinder the Looking Glass Project. Currently Looking Glass works by using the IVSHMEM shared memory device which is fed by an application that captures the guest's video output. While this works it is sub-optimal because we first have to perform a CPU copy of the captured frame into shared RAM, and then back out again for display. Because the destination buffers are allocated by closed proprietary code (DirectX, or NVidia NvFBC) there is no way to have the frame placed directly into the IVSHMEM shared ram. This new device, currently named `introspection` (which needs a more suitable name, porthole perhaps?), provides a means of translating guest physical addresses to host virtual addresses, and finally to the host offsets in RAM for file-backed memory guests. It does this by means of a simple protocol over a unix socket (chardev) which is supplied the appropriate fd for the VM's system RAM. The guest (in this case, Windows), when presented with the address of a userspace buffer and size, will mlock the appropriate pages into RAM and pass guest physical addresses to the virtual device. This device and the windows driver have been designed in such a way that it's a utility device for any project and/or application that could make use of it. The PCI subsystem vendor and device ID are used to provide a means of device identification in cases where multiple devices may be in use for differing applications. This also allows one common driver to be used for any other projects wishing to build on this device. My ultimate goal is to get this to a state where it could be accepted upstream into Qemu at which point Looking Glass would be modified to use it instead of the IVSHMEM device. My git repository with the new device can be found at: https://github.com/gnif/qemu The new device is: https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c Looking Glass: https://looking-glass.hostfission.com/ The windows driver, while working, needs some cleanup before the source is published. I intend to maintain both this device and the windows driver including producing a signed Windows 10 driver if Redhat are unwilling or unable. Kind Regards, Geoffrey McRae HostFission https://hostfission.com
Re: RFC: New device for zero-copy VM memory access
Just to follow this up, here is a sample client application for this device https://gist.github.com/gnif/77e7fb54604b42a1a98ecb8bf3d2cf46 -Geoff On 2019-10-30 01:31, ge...@hostfission.com wrote: Hi All, Over the past week, I have been working to come up with a solution to the memory transfer performance issues that hinder the Looking Glass Project. Currently Looking Glass works by using the IVSHMEM shared memory device which is fed by an application that captures the guest's video output. While this works it is sub-optimal because we first have to perform a CPU copy of the captured frame into shared RAM, and then back out again for display. Because the destination buffers are allocated by closed proprietary code (DirectX, or NVidia NvFBC) there is no way to have the frame placed directly into the IVSHMEM shared ram. This new device, currently named `introspection` (which needs a more suitable name, porthole perhaps?), provides a means of translating guest physical addresses to host virtual addresses, and finally to the host offsets in RAM for file-backed memory guests. It does this by means of a simple protocol over a unix socket (chardev) which is supplied the appropriate fd for the VM's system RAM. The guest (in this case, Windows), when presented with the address of a userspace buffer and size, will mlock the appropriate pages into RAM and pass guest physical addresses to the virtual device. This device and the windows driver have been designed in such a way that it's a utility device for any project and/or application that could make use of it. The PCI subsystem vendor and device ID are used to provide a means of device identification in cases where multiple devices may be in use for differing applications. This also allows one common driver to be used for any other projects wishing to build on this device. My ultimate goal is to get this to a state where it could be accepted upstream into Qemu at which point Looking Glass would be modified to use it instead of the IVSHMEM device. My git repository with the new device can be found at: https://github.com/gnif/qemu The new device is: https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c Looking Glass: https://looking-glass.hostfission.com/ The windows driver, while working, needs some cleanup before the source is published. I intend to maintain both this device and the windows driver including producing a signed Windows 10 driver if Redhat are unwilling or unable. Kind Regards, Geoffrey McRae HostFission https://hostfission.com
RFC: New device for zero-copy VM memory access
Hi All, Over the past week, I have been working to come up with a solution to the memory transfer performance issues that hinder the Looking Glass Project. Currently Looking Glass works by using the IVSHMEM shared memory device which is fed by an application that captures the guest's video output. While this works it is sub-optimal because we first have to perform a CPU copy of the captured frame into shared RAM, and then back out again for display. Because the destination buffers are allocated by closed proprietary code (DirectX, or NVidia NvFBC) there is no way to have the frame placed directly into the IVSHMEM shared ram. This new device, currently named `introspection` (which needs a more suitable name, porthole perhaps?), provides a means of translating guest physical addresses to host virtual addresses, and finally to the host offsets in RAM for file-backed memory guests. It does this by means of a simple protocol over a unix socket (chardev) which is supplied the appropriate fd for the VM's system RAM. The guest (in this case, Windows), when presented with the address of a userspace buffer and size, will mlock the appropriate pages into RAM and pass guest physical addresses to the virtual device. This device and the windows driver have been designed in such a way that it's a utility device for any project and/or application that could make use of it. The PCI subsystem vendor and device ID are used to provide a means of device identification in cases where multiple devices may be in use for differing applications. This also allows one common driver to be used for any other projects wishing to build on this device. My ultimate goal is to get this to a state where it could be accepted upstream into Qemu at which point Looking Glass would be modified to use it instead of the IVSHMEM device. My git repository with the new device can be found at: https://github.com/gnif/qemu The new device is: https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c Looking Glass: https://looking-glass.hostfission.com/ The windows driver, while working, needs some cleanup before the source is published. I intend to maintain both this device and the windows driver including producing a signed Windows 10 driver if Redhat are unwilling or unable. Kind Regards, Geoffrey McRae HostFission https://hostfission.com
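For context on the socket side: since the chardev is "supplied the appropriate fd for the VM's system RAM", the client ends up doing the standard SCM_RIGHTS receive. A sketch follows, assuming (this is not the actual porthole wire format) that the fd rides on an otherwise empty message:

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* receive one fd (the guest-RAM fd) over the unix socket */
int recv_ram_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg = {
        .msg_iov        = &iov,
        .msg_iovlen     = 1,
        .msg_control    = u.buf,
        .msg_controllen = sizeof(u.buf),
    };

    if (recvmsg(sock, &msg, 0) < 0)
        return -1;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (!c || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS)
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(c), sizeof(fd));
    return fd;   /* mmap() this to get at guest RAM */
}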
Re: Adding a memory alias breaks v-rings
Hi Phil On 2019-10-24 22:07, Philippe Mathieu-Daudé wrote: Hi Geoffrey, On 10/24/19 10:27 AM, ge...@hostfission.com wrote: Hi All, I have been working on adding a feature as a proof of concept to improve the performance of applications like Looking Glass by avoiding additional memory copies. My goal is to alias part of the IVSHMEM shared memory over a pointer provided by the guest OS capture API (DXGI Desktop Duplication or NVIDIA Frame Buffer Capture). I have managed to get this working by adding a few additional configuration registers to the IVSHMEM device and enhanced the IVSHMEM windows driver with suitable IOCTLs to set this all up. While this concept is backwards it needs to work this way as we do not have control over the destination buffer allocation by the GPU driver. This all works, however, it has exposed a bug (or I am doing things improperly) with the way that vhost tracks memory. When calling memory_region_add_subregion_overlap the memory listener in vhost fires triggering vhost_region_add_section. According to the comments this code depends on being called in memory address order, but because I am adding the alias region late, it's out of order, and also splits the upper memory region. This has the effect of corrupting/breaking one or more random vrings, as evidenced by the crash/hang of vhost-net or other virtio devices. I'm not sure this is the same issue I had before, but you might find Frederic and Alexey suggestions from this thread helpful: https://www.mail-archive.com/qemu-devel@nongnu.org/msg525833.html Also note vhost_region_add_section() you mentioned has this comment: if (need_add) { ... /* The flatview isn't stable and we don't use it, making it NULL * means we can memcmp the list. */ dev->tmp_sections[dev->n_tmp_sections - 1].fv = NULL; Maybe you need this change: -- >8 -- --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -642,6 +642,7 @@ static void vhost_region_add_section(struct vhost_dev *dev, */ dev->tmp_sections[dev->n_tmp_sections - 1].fv = NULL; memory_region_ref(section->mr); +memory_region_update_container_subregions(section->mr); } } --- Unfortunately not, `memory_region_update_container_subregions` is private in memory.c hangs the VM even if I expose it and call it as you suggested. It is also already called via memory_region_add_subregion_overlap anyway. Thanks for the suggestion though :) Regards, Phil. The following and errors are also logged regarding section alignment: qemu-system-x86_64: vhost_region_add_section:Section rounded to 3c000 prior to previous 3fc4f9000 qemu-system-x86_64: vhost_region_add_section:Section rounded to 3c000 prior to previous 3fc4f9000 Here is the flat view after the alias has been added. 0001-0003fc4f8fff (prio 0, ram): mem @8000 kvm 0003fc4f9000-0003fc4f9fff (prio 1, ram): ivshmem kvm 0003fc4fa000-00043fff (prio 0, ram): mem @00037c4fa000 kvm When the guest doesn't crash out due to the obvious corruption it is possible to verify that the alias is in the right place and fully functional. Unfortunately, I simply do not have enough of a grasp on vhost to understand exactly what is going on and how to correct it. Getting this feature working is highly desirable as it should be possible to obtain GPU -> GPU memory transfers between guests without requiring workstation/professional graphics cards. Kind Regards, Geoffrey McRae
Adding a memory alias breaks v-rings
Hi All, I have been working on adding a feature as a proof of concept to improve the performance of applications like Looking Glass by avoiding additional memory copies. My goal is to alias part of the IVSHMEM shared memory over a pointer provided by the guest OS capture API (DXGI Desktop Duplication or NVIDIA Frame Buffer Capture). I have managed to get this working by adding a few additional configuration registers to the IVSHMEM device and enhanced the IVSHMEM windows driver with suitable IOCTLs to set this all up. While this concept is backwards it needs to work this way as we do not have control over the destination buffer allocation by the GPU driver. This all works, however, it has exposed a bug (or I am doing things improperly) with the way that vhost tracks memory. When calling memory_region_add_subregion_overlap the memory listener in vhost fires triggering vhost_region_add_section. According to the comments this code depends on being called in memory address order, but because I am adding the alias region late, it's out of order, and also splits the upper memory region. This has the effect of corrupting/breaking one or more random vrings, as evidenced by the crash/hang of vhost-net or other virtio devices. The following and errors are also logged regarding section alignment: qemu-system-x86_64: vhost_region_add_section:Section rounded to 3c000 prior to previous 3fc4f9000 qemu-system-x86_64: vhost_region_add_section:Section rounded to 3c000 prior to previous 3fc4f9000 Here is the flat view after the alias has been added. 0001-0003fc4f8fff (prio 0, ram): mem @8000 kvm 0003fc4f9000-0003fc4f9fff (prio 1, ram): ivshmem kvm 0003fc4fa000-00043fff (prio 0, ram): mem @00037c4fa000 kvm When the guest doesn't crash out due to the obvious corruption it is possible to verify that the alias is in the right place and fully functional. Unfortunately, I simply do not have enough of a grasp on vhost to understand exactly what is going on and how to correct it. Getting this feature working is highly desirable as it should be possible to obtain GPU -> GPU memory transfers between guests without requiring workstation/professional graphics cards. Kind Regards, Geoffrey McRae
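For context, the aliasing step described above boils down to something like the sketch below; the field and function names are illustrative rather than the actual IVSHMEM changes:

/* overlay part of the IVSHMEM shared memory on top of guest RAM at the
 * guest physical address the Windows driver reported via the new registers */
static void ivshmem_map_alias(IVShmemState *s, hwaddr guest_pa,
                              uint64_t shm_offset, uint64_t size)
{
    memory_region_init_alias(&s->fb_alias, OBJECT(s), "ivshmem-alias",
                             &s->server_bar2, shm_offset, size);
    /* priority 1 so the alias wins over system RAM; this overlap is what
     * splits the flat view and fires the vhost listener out of order */
    memory_region_add_subregion_overlap(get_system_memory(), guest_pa,
                                        &s->fb_alias, 1);
}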
Re: [Qemu-devel] [RFC for-3.2 PATCH 0/7] pcie: Enhanced link speed and width support
I can confirm that these patches work as expected. Thank you kindly Alex for your hard work! Tested-by: Geoffrey McRae On 2018-11-15 07:50, Alex Williamson wrote: QEMU exposes gen1 PCI-express interconnect devices supporting only 2.5GT/s and x1 width. It might not seem obvious that a virtual bandwidth limitation can result in a real performance degradation, but it's been reported that in some configurations assigned GPUs might not scale their link speed up to the maximum supported value if the downstream port above it only advertises limited link support. As proposed[1] this series effectively implements virtual link negotiation on downstream ports and enhances the generic PCIe root port to allow user configurable speeds and widths. The "negotiation" simply mirrors the link status of the connected downstream device providing the appearance of dynamic link speed scaling to match the endpoint device. Not yet implemented from the proposal is support for globally updating defaults based on machine type, though the foundation is provided here by allowing supporting PCIESlots to implement an instance_init callback which can call into a common helper for this. I have not specifically tested migration with this, but we already consider LNKSTA to be dynamic and the other changes implemented here are static config space changes with no changes being implemented for devices using default values, ie. they should be compatible by virtue of existing config space migration support. I think I've covered the required link related registers to support PCIe 4.0, but please let me know if I've missed any. Testing and feedback appreciated, patch 6/7 provides example qemu:arg options and requirements to use with existing libvirt. Native libvirt support TBD. Thanks, Alex [1] https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03086.html --- Alex Williamson (7): pcie: Create enums for link speed and width pci: Sync PCIe downstream port LNKSTA on read qapi: Define PCIe link speed and width properties pcie: Add link speed and width fields to PCIESlot pcie: Fill PCIESlot link fields to support higher speeds and widths pcie: Allow generic PCIe root port to specify link speed and width vfio/pci: Remove PCIe Link Status emulation hw/core/qdev-properties.c | 178 hw/pci-bridge/gen_pcie_root_port.c |2 hw/pci-bridge/pcie_root_port.c | 14 +++ hw/pci/pci.c |4 + hw/pci/pcie.c | 118 +++- hw/vfio/pci.c |9 -- include/hw/pci/pci.h | 13 +++ include/hw/pci/pcie.h |1 include/hw/pci/pcie_port.h |4 + include/hw/pci/pcie_regs.h | 23 - include/hw/qdev-properties.h |8 ++ qapi/common.json | 42 12 files changed, 404 insertions(+), 12 deletions(-)
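For anyone wanting to try the series before libvirt support lands, the command line ends up looking roughly like the lines below. The property names and values here are assumptions from memory; the authoritative example is the one in patch 6/7:

-device pcie-root-port,id=root_port1,bus=pcie.0,chassis=1,slot=1,x-speed=8,x-width=16 \
-device vfio-pci,host=0d:00.0,bus=root_port1

With that, the virtual downstream port advertises 8GT/s x16 instead of the old 2.5GT/s x1, and LNKSTA mirrors whatever the assigned GPU actually negotiates.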
Re: [Qemu-devel] [PATCH 1/2] ps2: Clear the queue on PS/2 mouse reset and obey device disable
On 2018-05-07 22:41, Gerd Hoffmann wrote: On Mon, May 07, 2018 at 10:26:24PM +1000, geoff--- via Qemu-devel wrote: On 2018-05-07 22:21, Gerd Hoffmann wrote: > On Mon, May 07, 2018 at 10:00:22PM +1000, geoff--- via Qemu-devel wrote: > > This allows guest's to correctly reinitialize and identify the mouse > > should the guest decide to re-scan or reset during mouse input events. > > > > Signed-off-by: Geoffrey McRae > > --- > > hw/input/ps2.c | 4 > > 1 file changed, 4 insertions(+) > > > > diff --git a/hw/input/ps2.c b/hw/input/ps2.c > > index 06f5d2ac4a..6edf046820 100644 > > --- a/hw/input/ps2.c > > +++ b/hw/input/ps2.c > > @@ -673,6 +673,9 @@ static void ps2_mouse_sync(DeviceState *dev) > > { > > PS2MouseState *s = (PS2MouseState *)dev; > > > > +if (!(s->mouse_status & MOUSE_STATUS_ENABLED)) > > +return; > > + > > Why this is needed? To quote: https://wiki.osdev.org/%228042%22_PS/2_Controller#Detecting_PS.2F2_Device_Types The device should respond to the "identify" command by sending a sequence of none, one or two identification bytes. However, if you just send the "identify" command you can't prevent the response from the "identify" command from being mixed up with keyboard/mouse data. To fix this problem, you need to send the "disable scanning" command first. Disabling scanning means that the device ignores the user (e.g. keyboards ignore keypresses, mice ignore mouse movement and button presses, etc) and won't send data to mess your device identification code up. Ok. Same check should be added to ps2_keyboard_event() then I guess? Quite correct, I will include this in the next patch set. cheers, Gerd
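The keyboard-side change being agreed on here would look roughly like the following; the exact placement inside ps2_keyboard_event() is an assumption, and the real v2 patch may differ:

static void ps2_keyboard_event(DeviceState *dev, QemuConsole *src,
                               InputEvent *evt)
{
    PS2KbdState *s = (PS2KbdState *)dev;

    /* scanning disabled: ignore key events, mirroring the mouse check */
    if (!s->scan_enabled)
        return;

    /* ... existing keycode translation and queuing ... */
}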
Re: [Qemu-devel] [PATCH 2/2] ps2: Fix mouse stream corruption due to lost data
On 2018-05-07 22:34, Gerd Hoffmann wrote: diff --git a/hw/input/ps2.c b/hw/input/ps2.c index 6edf046820..011290920f 100644 --- a/hw/input/ps2.c +++ b/hw/input/ps2.c @@ -192,12 +192,50 @@ void ps2_queue(PS2State *s, int b) { PS2Queue *q = &s->queue; -if (q->count >= PS2_QUEUE_SIZE - 1) +if (q->count == PS2_QUEUE_SIZE) +{ +printf("Warning! PS2 Queue Overflow!\n"); return; +} Leftover debug printf? Correct :), I will remove it. +void ps2_raise(PS2State *s) +{ +s->update_irq(s->update_arg, 1); +} + +void ps2_queue_raise(PS2State *s, int b) +{ +ps2_queue(s, b); +s->update_irq(s->update_arg, 1); +} I'd suggest to keep the ps2_queue() name. Makes the patch much smaller and easier to review. Factor out the code to actually queue things to a new ps2_queue_noirq() function. +void ps2_queue_bytes(PS2State *s, const int length, ...) Ack. I'd prefer to not use vaargs here as gcc can't check the arguments then. Suggest to just have ps2_queue_{2,3,4}() helpers instead to queue multibyte messages. Ack. cheers, Gerd Thanks, Geoff
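Putting Gerd's suggestions together, the shape being asked for is roughly this (a sketch, not the final v2 patch):

/* queue without raising the IRQ, so multi-byte replies can be made atomic */
void ps2_queue_noirq(PS2State *s, int b)
{
    PS2Queue *q = &s->queue;

    if (q->count == PS2_QUEUE_SIZE) {
        return;                          /* queue full, drop the byte */
    }

    q->data[q->wptr] = b;
    if (++q->wptr == PS2_QUEUE_SIZE) {
        q->wptr = 0;
    }
    q->count++;
}

/* keeps the existing name and behaviour: queue one byte and raise the IRQ */
void ps2_queue(PS2State *s, int b)
{
    ps2_queue_noirq(s, b);
    s->update_irq(s->update_arg, 1);
}

/* fixed-arity helper instead of varargs, so gcc can check the arguments;
 * ps2_queue_3() and ps2_queue_4() would follow the same pattern */
void ps2_queue_2(PS2State *s, int b1, int b2)
{
    if (PS2_QUEUE_SIZE - s->queue.count < 2) {
        return;                          /* not enough room for the whole reply */
    }
    ps2_queue_noirq(s, b1);
    ps2_queue_noirq(s, b2);
    s->update_irq(s->update_arg, 1);
}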
Re: [Qemu-devel] [PATCH 1/2] ps2: Clear the queue on PS/2 mouse reset and obey device disable
On 2018-05-07 22:21, Gerd Hoffmann wrote: On Mon, May 07, 2018 at 10:00:22PM +1000, geoff--- via Qemu-devel wrote: This allows guest's to correctly reinitialize and identify the mouse should the guest decide to re-scan or reset during mouse input events. Signed-off-by: Geoffrey McRae --- hw/input/ps2.c | 4 1 file changed, 4 insertions(+) diff --git a/hw/input/ps2.c b/hw/input/ps2.c index 06f5d2ac4a..6edf046820 100644 --- a/hw/input/ps2.c +++ b/hw/input/ps2.c @@ -673,6 +673,9 @@ static void ps2_mouse_sync(DeviceState *dev) { PS2MouseState *s = (PS2MouseState *)dev; +if (!(s->mouse_status & MOUSE_STATUS_ENABLED)) +return; + Why this is needed? To quote: https://wiki.osdev.org/%228042%22_PS/2_Controller#Detecting_PS.2F2_Device_Types The device should respond to the "identify" command by sending a sequence of none, one or two identification bytes. However, if you just send the "identify" command you can't prevent the response from the "identify" command from being mixed up with keyboard/mouse data. To fix this problem, you need to send the "disable scanning" command first. Disabling scanning means that the device ignores the user (e.g. keyboards ignore keypresses, mice ignore mouse movement and button presses, etc) and won't send data to mess your device identification code up. @@ -776,6 +779,7 @@ void ps2_write_mouse(void *opaque, int val) s->mouse_resolution = 2; s->mouse_status = 0; s->mouse_type = 0; +ps2_reset_queue(&s->common); Looks good. cheers, Gerd
[Qemu-devel] [PATCH 2/2] ps2: Fix mouse stream corruption due to lost data
This fixes an issue by adding bounds checking to multi-byte packets where the PS/2 mouse data stream may become corrupted due to data being discarded when the PS/2 ringbuffer is full. Interrupts for Multi-byte responses are postponed until the final byte has been queued. These changes fix a bug where windows guests drop the mouse device entirely requring the guest to be restarted. Signed-off-by: Geoffrey McRae --- hw/input/pckbd.c | 6 +-- hw/input/ps2.c | 160 +-- 2 files changed, 110 insertions(+), 56 deletions(-) diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c index f17f18e51b..004ea3466d 100644 --- a/hw/input/pckbd.c +++ b/hw/input/pckbd.c @@ -216,9 +216,9 @@ static uint64_t kbd_read_status(void *opaque, hwaddr addr, static void kbd_queue(KBDState *s, int b, int aux) { if (aux) -ps2_queue(s->mouse, b); +ps2_queue_raise(s->mouse, b); else -ps2_queue(s->kbd, b); +ps2_queue_raise(s->kbd, b); } static void outport_write(KBDState *s, uint32_t val) diff --git a/hw/input/ps2.c b/hw/input/ps2.c index 6edf046820..011290920f 100644 --- a/hw/input/ps2.c +++ b/hw/input/ps2.c @@ -192,12 +192,50 @@ void ps2_queue(PS2State *s, int b) { PS2Queue *q = &s->queue; -if (q->count >= PS2_QUEUE_SIZE - 1) +if (q->count == PS2_QUEUE_SIZE) +{ +printf("Warning! PS2 Queue Overflow!\n"); return; +} + q->data[q->wptr] = b; if (++q->wptr == PS2_QUEUE_SIZE) q->wptr = 0; q->count++; +} + +void ps2_raise(PS2State *s) +{ +s->update_irq(s->update_arg, 1); +} + +void ps2_queue_raise(PS2State *s, int b) +{ +ps2_queue(s, b); +s->update_irq(s->update_arg, 1); +} + +void ps2_queue_bytes(PS2State *s, const int length, ...) +{ +PS2Queue *q = &s->queue; + +if (PS2_QUEUE_SIZE - q->count < length) { +printf("Unable to send %d bytes, buffer full\n", length); +return; +} + +va_list args; +va_start(args, length); + +for(int i = 0; i < length; ++i) +{ +q->data[q->wptr] = va_arg(args, int); +if (++q->wptr == PS2_QUEUE_SIZE) +q->wptr = 0; +q->count++; +} + +va_end(args); s->update_irq(s->update_arg, 1); } @@ -213,13 +251,13 @@ static void ps2_put_keycode(void *opaque, int keycode) if (keycode == 0xf0) { s->need_high_bit = true; } else if (s->need_high_bit) { -ps2_queue(&s->common, translate_table[keycode] | 0x80); +ps2_queue_raise(&s->common, translate_table[keycode] | 0x80); s->need_high_bit = false; } else { -ps2_queue(&s->common, translate_table[keycode]); +ps2_queue_raise(&s->common, translate_table[keycode]); } } else { -ps2_queue(&s->common, keycode); +ps2_queue_raise(&s->common, keycode); } } @@ -490,72 +528,80 @@ void ps2_write_keyboard(void *opaque, int val) case -1: switch(val) { case 0x00: -ps2_queue(&s->common, KBD_REPLY_ACK); +ps2_queue_raise(&s->common, KBD_REPLY_ACK); break; case 0x05: -ps2_queue(&s->common, KBD_REPLY_RESEND); +ps2_queue_raise(&s->common, KBD_REPLY_RESEND); break; case KBD_CMD_GET_ID: -ps2_queue(&s->common, KBD_REPLY_ACK); /* We emulate a MF2 AT keyboard here */ -ps2_queue(&s->common, KBD_REPLY_ID); if (s->translate) -ps2_queue(&s->common, 0x41); +ps2_queue_bytes(&s->common, 3, +KBD_REPLY_ACK, +KBD_REPLY_ID, +0x41); else -ps2_queue(&s->common, 0x83); +ps2_queue_bytes(&s->common, 3, +KBD_REPLY_ACK, +KBD_REPLY_ID, +0x83); break; case KBD_CMD_ECHO: -ps2_queue(&s->common, KBD_CMD_ECHO); +ps2_queue_raise(&s->common, KBD_CMD_ECHO); break; case KBD_CMD_ENABLE: s->scan_enabled = 1; -ps2_queue(&s->common, KBD_REPLY_ACK); +ps2_queue_raise(&s->common, KBD_REPLY_ACK); break; case KBD_CMD_SCANCODE: case KBD_CMD_SET_LEDS: case KBD_CMD_SET_RATE: s->common.write_cmd = val; -ps2_queue(&s->common, KBD_REPLY_ACK); 
+ps2_queue_raise(&s->common, KBD_REPLY_ACK); break; case KBD_CMD_RESET_DISABLE: ps2_reset_keyboard(s); s->scan_enabled = 0; -ps2_queue(&s->common, KBD_REPLY_ACK); +ps2_queue_raise(&s->common, KBD_REPLY_ACK); break; case KBD_CMD_RESET_ENABLE: ps2_reset_keyboard(s); s->scan_enabled = 1; -ps2_queue(&s->common, KBD_REPLY_ACK); +ps2_queue_ra
[Qemu-devel] [PATCH 1/2] ps2: Clear the queue on PS/2 mouse reset and obey device disable
This allows guests to correctly reinitialize and identify the mouse should the guest decide to re-scan or reset during mouse input events.

Signed-off-by: Geoffrey McRae
---
 hw/input/ps2.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/hw/input/ps2.c b/hw/input/ps2.c
index 06f5d2ac4a..6edf046820 100644
--- a/hw/input/ps2.c
+++ b/hw/input/ps2.c
@@ -673,6 +673,9 @@ static void ps2_mouse_sync(DeviceState *dev)
 {
     PS2MouseState *s = (PS2MouseState *)dev;
 
+    if (!(s->mouse_status & MOUSE_STATUS_ENABLED))
+        return;
+
     if (s->mouse_buttons) {
         qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
     }
@@ -776,6 +779,7 @@ void ps2_write_mouse(void *opaque, int val)
         s->mouse_resolution = 2;
         s->mouse_status = 0;
         s->mouse_type = 0;
+        ps2_reset_queue(&s->common);
         ps2_queue(&s->common, AUX_ACK);
         ps2_queue(&s->common, 0xaa);
         ps2_queue(&s->common, s->mouse_type);
--
2.14.2
Re: [Qemu-devel] [PATCH v7 0/9] i386: Enable TOPOEXT to support hyperthreading on AMD CPU
Works well for me, thanks! Tested-by: Geoffrey McRae On 2018-04-27 02:26, Babu Moger wrote: This series enables the TOPOEXT feature for AMD CPUs. This is required to support hyperthreading on kvm guests. This addresses the issues reported in these bugs: https://bugzilla.redhat.com/show_bug.cgi?id=1481253 https://bugs.launchpad.net/qemu/+bug/1703506 v7: Rebased on top of latest tree after 2.12 release and done few basic tests. There are no changes except for few minor hunks. Hopefully this gets pulled into 2.13 release. Please review, let me know of any feedback. v6: 1.Fixed problem with patch#4(Add new property to control cache info). The parameter legacy_cache should be "on" by default on machine type "pc-q35-2.10". This was found by Alexandr Iarygin. 2.Fixed the l3 cache size for EPYC based machines(patch#3). Also, fixed the number of logical processors sharing the cache(patch#6). Only L3 cache is shared by multiple cores but not L1 or L2. This was a bug while decoding. This was found by Geoffrey McRae and he verified the fix. v5: In this series I tried to address the feedback from Eduardo Habkost. The discussion thread is here. https://patchwork.kernel.org/patch/10299745/ The previous thread is here. http://patchwork.ozlabs.org/cover/884885/ Reason for these changes. The cache properties for AMD family of processors have changed from previous releases. We don't want to display the new information on the old family of processors as this might cause compatibility issues. Changes: 1.Based the patches on top of Eduardo's(patch#1) patch. Changed few things. Moved the Cache definitions to cpu.h file. Changed the CPUID_4 names to generic names. 2.Added a new propery "legacy-cache" in cpu object(patch#2). This can be used to display the old property even if the host supports the new cache properties. 3.Added cache information in X86CPUDefinition and CPUX86State 4.Patch 6-7 changed quite a bit from previous version does to new approach. 5.Addressed few issues with CPUID_8000_001d and CPUID_8000_001E. v4: 1.Removed the checks under cpuid 0x801D leaf(patch #2). These check are not necessary. Found this during internal review. 2.Added CPUID_EXT3_TOPOEXT feature for all the 17 family(patch #4). This was found by Kash Pande during his testing. 3.Removed th hardcoded cpuid xlevel and dynamically extended if CPUID_EXT3_TOPOEXT is supported(Suggested by Brijesh Singh). v3: 1.Removed the patch #1. Radim mentioned that original typo problem is in linux kernel header. qemu is just copying those files. 2.In previous version, I used the cpuid 4 definitions for AMDs cpuid leaf 0x801D. CPUID 4 is very intel specific and we dont want to expose those details under AMD. I have renamed some of these definitions as generic. These changes are in patch#1. Radim, let me know if this is what you intended. 3.Added assert to for core_id(Suggested by Radim Krčmář). 4.Changed the if condition under "L3 cache info"(Suggested by Gary Hook). 5.Addressed few more text correction and code cleanup(Suggested by Thomas Lendacky). v2: Fixed few more minor issues per Gary Hook's comments. Thank you Gary. Removed the patch#1. We need to handle the instruction cache associativity seperately. It varies based on the cpu family. I will comeback to that later. Added two more typo corrections in patch#1 and patch#5. v1: Stanislav Lanci posted few patches earlier. https://patchwork.kernel.org/patch/10040903/ Rebased his patches with few changes. 1.Spit the patches into two, separating cpuid functions 0x801D and 0x801E (Patch 2 and 3). 
2.Removed the generic non-intel check and made a separate patch with some changes(Patch 5). 3.Fixed L3_N_SETS_AMD(from 4096 to 8192) based on CPUID_Fn801D_ECX_x03. Added 2 more patches. Patch 1. Fixes cache associativity. Patch 4. Adds TOPOEXT feature on AMD EPYC CPU. Babu Moger (8): i386: Add cache information in X86CPUDefinition i386: Initialize cache information for EPYC family processors i386: Add new property to control cache info i386: Use the statically loaded cache definitions i386: Populate AMD Processor Cache Information for cpuid 0x801D i386: Add support for CPUID_8000_001E for AMD i386: Enable TOPOEXT feature on AMD EPYC CPU i386: Remove generic SMT thread check Eduardo Habkost (1): i386: Helpers to encode cache information consistently include/hw/i386/pc.h | 4 + target/i386/cpu.c| 736 ++- target/i386/cpu.h| 66 + target/i386/kvm.c| 29 +- 4 files changed, 702 insertions(+), 133 deletions(-)
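As a usage note (an illustrative command line, not taken from the series itself): with TOPOEXT exposed, SMT can be described to the guest along these lines, and the guest will then see proper thread siblings on EPYC/Ryzen hosts:

-cpu EPYC,topoext=on \
-smp 16,sockets=1,cores=8,threads=2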
[Qemu-devel] Linux Guest Memory Performance
Hi All,

I am having some very strange issues with Qemu and memory copy performance. It seems that when performing buffer -> buffer copies of 8MB or lower the performance is horrid.

Test program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <time.h>

static inline uint64_t nanotime()
{
  struct timespec time;
  clock_gettime(CLOCK_MONOTONIC_RAW, &time);
  return ((uint64_t)time.tv_sec * 1e9) + time.tv_nsec;
}

int main(int argc, char * argv[])
{
  const int s = atoi(argv[1]);
  int size = s * 1024 * 1024;

  char * buffer1 = malloc(size);
  char * buffer2 = malloc(size);

  uint64_t t = nanotime();
  for(int i = 0; i < 1000; ++i)
    memcpy(buffer1, buffer2, size);
  printf("%2u MB = %f ms\n", s, ((float)(nanotime() - t) / 1000.0f) / 100.0f);

  free(buffer1);
  free(buffer2);
  return 0;
}

Compiled with: gcc main.c -O3

Native Output:

# for I in `seq 1 32`; do ./a.out $I; done
 1 MB = 0.026123 ms
 2 MB = 0.048406 ms
 3 MB = 0.073877 ms
 4 MB = 0.096974 ms
 5 MB = 0.115063 ms
 6 MB = 0.139025 ms
 7 MB = 0.163888 ms
 8 MB = 0.187360 ms
 9 MB = 0.203941 ms
10 MB = 0.227855 ms
11 MB = 0.251903 ms
12 MB = 0.279699 ms
13 MB = 0.296424 ms
14 MB = 0.315042 ms
15 MB = 0.340979 ms
16 MB = 0.358750 ms
17 MB = 0.382865 ms
18 MB = 0.403458 ms
19 MB = 0.426864 ms
20 MB = 0.448165 ms
21 MB = 0.473857 ms
22 MB = 0.493515 ms
23 MB = 0.520299 ms
24 MB = 0.538550 ms
25 MB = 0.566735 ms
26 MB = 0.588072 ms
27 MB = 0.612500 ms
28 MB = 0.633682 ms
29 MB = 0.659352 ms
30 MB = 0.690467 ms
31 MB = 0.698611 ms
32 MB = 0.721284 ms

Guest Output:

# for I in `seq 1 32`; do ./a.out $I; done
 1 MB = 0.026120 ms
 2 MB = 0.049053 ms
 3 MB = 0.081695 ms
 4 MB = 0.126873 ms
 5 MB = 0.161380 ms
 6 MB = 0.316972 ms
 7 MB = 0.492851 ms
 8 MB = 0.673696 ms
 9 MB = 0.221208 ms
10 MB = 0.256582 ms
11 MB = 0.276354 ms
12 MB = 0.316020 ms
13 MB = 0.327643 ms
14 MB = 0.363536 ms
15 MB = 0.382575 ms
16 MB = 0.401538 ms
17 MB = 0.436602 ms
18 MB = 0.473452 ms
19 MB = 0.491850 ms
20 MB = 0.527252 ms
21 MB = 0.546229 ms
22 MB = 0.561816 ms
23 MB = 0.582428 ms
24 MB = 0.614430 ms
25 MB = 0.660698 ms
26 MB = 0.670087 ms
27 MB = 0.688908 ms
28 MB = 0.714887 ms
29 MB = 0.746829 ms
30 MB = 0.763404 ms
31 MB = 0.780527 ms
32 MB = 0.821888 ms

Note that leading up to 8MB the copy is getting slower, but once the copy exceeds 8MB the copy is 3x faster. Does anyone have any insight as to why this might be?
I am running master @ 11ed801d3df3c6e46b2f1f97dcfbf4ca3a2a2f4f Host: AMD Thread Ripper 1950x Guest launch parameters: /usr/local/bin/qemu-system-x86_64 \ -nographic \ -runas geoff \ -monitor stdio \ -name guest=Aeryn,debug-threads=on \ -machine q35,accel=kvm,usb=off,vmport=off,dump-guest-core=off \ -cpu host,hv_time,hv_relaxed,hv_vapic,hv_vendor_id=lakeuv283713,kvm=off \ -drive file=$DIR/ovmf/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on \ -drive file=$DIR/vars.fd,if=pflash,format=raw,unit=1 \ -m 8192 \ -mem-prealloc \ -mem-path /dev/hugepages/aeryn \ -realtime mlock=off \ -smp 32,sockets=1,cores=16,threads=2 \ -no-user-config \ -nodefaults \ -balloon none \ \ -global ICH9-LPC.disable_s3=1 \ -global ICH9-LPC.disable_s4=1 \ \ -rtc base=localtime,driftfix=slew \ -global kvm-pit.lost_tick_policy=discard \ -no-hpet \ \ -boot strict=on \ \ -object iothread,id=iothread1 \ -device virtio-scsi-pci,id=scsi1,iothread=iothread1 \ -drive if=none,id=hd0,file=/dev/moya/aeryn-efi,format=raw,aio=threads \ -device scsi-hd,bus=scsi1.0,drive=hd0,bootindex=1 \ -drive if=none,id=hd1,file=/dev/moya/aeryn-rootfs,format=raw,aio=threads \ -device scsi-hd,bus=scsi1.0,drive=hd1 \ \ -netdev tap,script=/home/geoff/VM/bin/ovs-ifup,downscript=/home/geoff/VM/bin/ovs-ifdown,ifname=aeryn.10,id=hostnet0 \ -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:06:12:34,bus=pcie.0 \ \ -device intel-hda,id=sound0,bus=pcie.0 \ -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \ \ -device vfio-pci,host=0d:00.0,id=hostdev1,bus=pcie.0,addr=0x09,multifunction=on \ -device vfio-pci,host=0d:00.1,id=hostdev2,bus=pcie.0,addr=0x09.1' \ \ -device ivshmem-plain,memdev=ivshmem \ -object memory-backend-file,id=ivshmem,share=on,mem-path=/dev/shm/looking-glass,size=128M \ \ -msg timestamp=on \ \ -object input-linux,id=mou1,evdev=/dev/input/by-id/usb-Razer_Razer_DeathAdder_2013-event-mouse \ -object input-linux,id=mou2,evdev=/dev/input/by-id/usb-Razer_Razer_DeathAdder_2013-if01-event-kbd \ -object input-linux,id=mou3,evdev=/dev/input/by-id/usb-Razer_Razer_DeathAdder_2013-if02-event-kbd \ -object input-linux,id=kbd1,evdev=/dev/input/by-id/usb-04d9_daskeyboard-event-kbd,grab_all=on,repeat=on \ -object input-linux,id=kbd2,evdev=/dev/input/by-id/usb-04d9_daskeyboard-event-if01 Thanks in advance.
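One thing that may be worth ruling out (a suggestion only, not something verified against this setup): buffers fresh from malloc() are not backed by physical pages until first touch, so part of what the loop measures can be page-fault and allocator behaviour rather than pure copy bandwidth, and that behaviour can differ between bare metal and a hugepage-backed guest. Pre-faulting both buffers before the timed section isolates that:

  /* suggestion: populate both buffers before timing so the measurement
     excludes first-touch page faults and allocator effects */
  memset(buffer1, 0x5a, size);
  memset(buffer2, 0xa5, size);

  uint64_t t = nanotime();

If the step at 8MB disappears with the buffers pre-touched, the difference lies in memory management rather than in the copy itself.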
Re: [Qemu-devel] [PATCH v2 0/3] ivshmem: MSI bug fixes
I just updated to the latest build and applied this patch set; now on VM reset QEMU crashes with the following assert:

ivshmem.c:467: ivshmem_add_kvm_msi_virq: Assertion `!s->msi_vectors[vector].pdev' failed.

On 2017-11-15 18:31, Ladi Prosek wrote:
Fixes bugs in the ivshmem device implementation uncovered with the new Windows ivshmem driver:
https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem

v1->v2:
* Patch 1 - added reproducer info to commit message (Markus)
* Patch 2 - restructured conditionals, fixed comment formatting (Markus)
* Patch 3 - added reproducer info to commit message (Markus)

Ladi Prosek (3):
  ivshmem: Don't update non-existent MSI routes
  ivshmem: Always remove irqfd notifiers
  ivshmem: Improve MSI irqfd error handling

 hw/misc/ivshmem.c | 77 +--
 1 file changed, 58 insertions(+), 19 deletions(-)
Re: [Qemu-devel] [PATCH 3/3] ivshmem: Improve MSI irqfd error handling
On 2017-11-14 04:27, Markus Armbruster wrote:
Ladi Prosek writes:
Adds a rollback path to ivshmem_enable_irqfd() and fixes ivshmem_disable_irqfd() to bail if irqfd has not been enabled.

Signed-off-by: Ladi Prosek

Is this a theoretical bug, or can you trigger it?

It is reproducible; I can trigger it by simply unloading the Windows driver and then attempting to re-load it.

-Geoff
Re: [Qemu-devel] [PATCH 1/3] ivshmem: Don't update non-existent MSI routes
Thanks Ladi, I had not yet had time to dig into these; this patch set resolves all issues I was aware of.

Tested-by: Geoffrey McRae

On 2017-11-11 04:34, Ladi Prosek wrote:
As of commit 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications"), QEMU crashes with:

  kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

if the ivshmem device is configured with more vectors than what the server supports. This is caused by the ivshmem_vector_unmask() being called on vectors that have not been initialized by ivshmem_add_kvm_msi_virq(). This commit fixes it by adding a simple check to the mask and unmask callbacks.

Note that the opposite mismatch, if the server supplies more vectors than what the device is configured for, is already handled and leads to output like:

  Too many eventfd received, device has 1 vectors

Fixes: 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications")
Signed-off-by: Ladi Prosek
---
 hw/misc/ivshmem.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index a5a46827fe..6e46669744 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -317,6 +317,10 @@ static int ivshmem_vector_unmask(PCIDevice *dev, unsigned vector,
     int ret;
 
     IVSHMEM_DPRINTF("vector unmask %p %d\n", dev, vector);
+    if (!v->pdev) {
+        error_report("ivshmem: vector %d route does not exist", vector);
+        return -EINVAL;
+    }
 
     ret = kvm_irqchip_update_msi_route(kvm_state, v->virq, msg, dev);
     if (ret < 0) {
@@ -331,12 +335,16 @@ static void ivshmem_vector_mask(PCIDevice *dev, unsigned vector)
 {
     IVShmemState *s = IVSHMEM_COMMON(dev);
     EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
+    MSIVector *v = &s->msi_vectors[vector];
     int ret;
 
     IVSHMEM_DPRINTF("vector mask %p %d\n", dev, vector);
+    if (!v->pdev) {
+        error_report("ivshmem: vector %d route does not exist", vector);
+        return;
+    }
 
-    ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n,
-                                                s->msi_vectors[vector].virq);
+    ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n, v->virq);
     if (ret != 0) {
         error_report("remove_irqfd_notifier_gsi failed");
     }
[Qemu-devel] PCI Passthrough + AMD + NPT
Hi All,

I have started to dig into why npt seems to slow down graphics performance on AMD systems using PCI passthrough and figured I would report what I have so far discovered.

I have noted the primary point of failure seems to be specifically with PhysX. This is why people only see a slow down in certain games; not everything uses PhysX.

Using FluidMark[1] the problem is immediately obvious, showing extremely low FPS on light/medium workloads with npt enabled, and extreme fluidity and high FPS with npt disabled.

Switching nVidia to use the CPU makes no difference to the performance when npt is enabled, which seems to indicate that PhysX is falling back to CPU due to a failure of some kind to initialize. With npt turned off, and nVidia set to use the CPU for PhysX, I see an identical performance drop off in FluidMark as I see when npt is enabled; this would seem to confirm the suspicion.

Since other features such as APIC are only available if npt is enabled, it could be something downstream of npt that is getting disabled as a consequence of turning off npt.

It might be interesting to see if we can get some diagnostics information out of PhysX to see what, if any, error or debugging information it might provide when it falls back to CPU.

1: http://www.geeks3d.com/20130308/fluidmark-1-5-1-physx-benchmark-fluid-sph-simulation-opengl-download/

Kind Regards,
Geoffrey McRae
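For reference, and as an addition not in the original report: assuming the host uses the kvm_amd module, the npt on/off comparison described above can be made by reading /sys/module/kvm_amd/parameters/npt to see the current setting, and toggling it by reloading the module with all VMs shut down, e.g. `modprobe -r kvm_amd; modprobe kvm_amd npt=0`, or persistently with an `options kvm_amd npt=0` line in a modprobe.d configuration file.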
Re: [Qemu-devel] ivshmem Windows Driver
On 2017-10-19 20:51, Ladi Prosek wrote: On Thu, Oct 19, 2017 at 11:41 AM, wrote: On 2017-10-19 20:07, ge...@hostfission.com wrote: On 2017-10-19 20:01, Ladi Prosek wrote: On Thu, Oct 19, 2017 at 10:44 AM, wrote: On 2017-10-19 19:35, Ladi Prosek wrote: On Wed, Oct 18, 2017 at 5:04 PM, wrote: Hi Ladi & Yan, I am pleased to present the completed driver for review, please see: https://github.com/gnif/kvm-guest-drivers-windows Awesome! Feel free to open pull request, it should be easier to comment on. Great, I will do so after I have addressed the below. Thanks again. I have created a PR, see: https://github.com/virtio-win/kvm-guest-drivers-windows/pull/174 * WoW considerations: It would be nice if the driver could detect that the map request is coming from a 32-bit process and expect a different layout of struct IVSHMEM_MMAP. I did think of this but I am unsure as to how to detect this. I don't think I ever used it but IoIs32bitProcess() looks promising. Obviously PVOID will be 32bit which will mess with the struct size and offset of vectors but I am not aware of a solution to this. If you have any suggestions on how to rectify this it would be very much appreciated. I was thinking something simple like: #ifdef _WIN64 typedef struct IVSHMEM_MMAP_32 { ... UINT32 ptr; ... } IVSHMEM_MMAP_32, *PIVSHMEM_MMAP_32; #endif in a private header. Then in the IOCTL handler call IoIs32bitProcess() and if it returns true, expect IVSHMEM_MMAP_32 instead of IVSHMEM_MMAP. Ah that makes sense, thanks! This has been done. * It would be cleaner to use READ_REGISTER_* and WRITE_REGISTER_* from/to IVSHMEMDeviceRegisters instead of plain memory accesses. Or at the very least the accesses should be marked volatile. I thought that mapping the IO space was enough for this since it is mapped as non-cacheable. I can see the point of marking it volatile but see no need to use the read/write register semantics. If this is what it takes however I am happy to do so. Code like this raises eyebrows: deviceContext->devRegisters->doorbell |= (UINT32)in->vector | (in->peerID << 16); Many readers will probably be wondering what exactly the compiler is allowed to do with this statement. May it end up ORing the lower and upper word separately, for example? OR [word ptr addr], in->vector OR [word ptr addr + 2], in->peerID And, by the way, is OR really what we want here? After double checking this you are dead right, the register is documented as write only. I will fix this. Done. * In ivshmem.inf: ManufacturerName="Red Hat, Inc." instead of "RedHat" No worries. * Is any of the API used by the driver Win10-only? Just curious, it's fine to build the driver only for Win10 for now even if it isn't. I have not tried to build it on anything older then win 10 build 10586 as I have nothing older, but AFAIK it should build on windows 8.1 or later just fine. This is more due to my lack of familiarity with Visual Studio, give me gcc and vim any day :). Gotcha, no worries, other versions can be tested later.
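For reference, a minimal sketch of the corrected doorbell write discussed in the review above, assuming the WDK WRITE_REGISTER_ULONG helper and the ivshmem doorbell layout of peer ID in the upper 16 bits and vector number in the lower 16; deviceContext, devRegisters and in are the names from the snippet quoted above and are not defined here:

    /* Sketch only: a single 32-bit MMIO store instead of a read-modify-write
     * OR on a register that is documented as write-only. */
    WRITE_REGISTER_ULONG(
        (volatile ULONG *)&deviceContext->devRegisters->doorbell,
        ((ULONG)in->peerID << 16) | (ULONG)in->vector);

A plain register write also removes any possibility of the compiler splitting the access into two separate 16-bit operations, which was the concern raised above.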
Re: [Qemu-devel] ivshmem Windows Driver
On 2017-10-19 20:07, ge...@hostfission.com wrote: On 2017-10-19 20:01, Ladi Prosek wrote: On Thu, Oct 19, 2017 at 10:44 AM, wrote: On 2017-10-19 19:35, Ladi Prosek wrote: On Wed, Oct 18, 2017 at 5:04 PM, wrote: Hi Ladi & Yan, I am pleased to present the completed driver for review, please see: https://github.com/gnif/kvm-guest-drivers-windows Awesome! Feel free to open pull request, it should be easier to comment on. Great, I will do so after I have addressed the below. Thanks again. I have created a PR, see: https://github.com/virtio-win/kvm-guest-drivers-windows/pull/174 * WoW considerations: It would be nice if the driver could detect that the map request is coming from a 32-bit process and expect a different layout of struct IVSHMEM_MMAP. I did think of this but I am unsure as to how to detect this. I don't think I ever used it but IoIs32bitProcess() looks promising. Obviously PVOID will be 32bit which will mess with the struct size and offset of vectors but I am not aware of a solution to this. If you have any suggestions on how to rectify this it would be very much appreciated. * It would be cleaner to use READ_REGISTER_* and WRITE_REGISTER_* from/to IVSHMEMDeviceRegisters instead of plain memory accesses. Or at the very least the accesses should be marked volatile. I thought that mapping the IO space was enough for this since it is mapped as non-cacheable. I can see the point of marking it volatile but see no need to use the read/write register semantics. If this is what it takes however I am happy to do so. Code like this raises eyebrows: deviceContext->devRegisters->doorbell |= (UINT32)in->vector | (in->peerID << 16); Many readers will probably be wondering what exactly the compiler is allowed to do with this statement. May it end up ORing the lower and upper word separately, for example? OR [word ptr addr], in->vector OR [word ptr addr + 2], in->peerID And, by the way, is OR really what we want here? After double checking this you are dead right, the register is documented as write only. I will fix this. Done. * In ivshmem.inf: ManufacturerName="Red Hat, Inc." instead of "RedHat" No worries. * Is any of the API used by the driver Win10-only? Just curious, it's fine to build the driver only for Win10 for now even if it isn't. I have not tried to build it on anything older then win 10 build 10586 as I have nothing older, but AFAIK it should build on windows 8.1 or later just fine. This is more due to my lack of familiarity with Visual Studio, give me gcc and vim any day :). Gotcha, no worries, other versions can be tested later.
Re: [Qemu-devel] ivshmem Windows Driver
On 2017-10-19 20:01, Ladi Prosek wrote: On Thu, Oct 19, 2017 at 10:44 AM, wrote: On 2017-10-19 19:35, Ladi Prosek wrote: On Wed, Oct 18, 2017 at 5:04 PM, wrote: Hi Ladi & Yan, I am pleased to present the completed driver for review, please see: https://github.com/gnif/kvm-guest-drivers-windows Awesome! Feel free to open pull request, it should be easier to comment on. Great, I will do so after I have addressed the below. Thanks again. * WoW considerations: It would be nice if the driver could detect that the map request is coming from a 32-bit process and expect a different layout of struct IVSHMEM_MMAP. I did think of this but I am unsure as to how to detect this. I don't think I ever used it but IoIs32bitProcess() looks promising. * It would be cleaner to use READ_REGISTER_* and WRITE_REGISTER_* from/to IVSHMEMDeviceRegisters instead of plain memory accesses. Or at the very least the accesses should be marked volatile. I thought that mapping the IO space was enough for this since it is mapped as non-cacheable. I can see the point of marking it volatile but see no need to use the read/write register semantics. If this is what it takes however I am happy to do so. Code like this raises eyebrows: deviceContext->devRegisters->doorbell |= (UINT32)in->vector | (in->peerID << 16); Many readers will probably be wondering what exactly the compiler is allowed to do with this statement. May it end up ORing the lower and upper word separately, for example? OR [word ptr addr], in->vector OR [word ptr addr + 2], in->peerID And, by the way, is OR really what we want here? After double checking this you are dead right, the register is documented as write only. I will fix this. * In ivshmem.inf: ManufacturerName="Red Hat, Inc." instead of "RedHat" No worries. * Is any of the API used by the driver Win10-only? Just curious, it's fine to build the driver only for Win10 for now even if it isn't. I have not tried to build it on anything older then win 10 build 10586 as I have nothing older, but AFAIK it should build on windows 8.1 or later just fine. This is more due to my lack of familiarity with Visual Studio, give me gcc and vim any day :). Gotcha, no worries, other versions can be tested later.
Re: [Qemu-devel] ivshmem Windows Driver
On 2017-10-19 19:35, Ladi Prosek wrote: On Wed, Oct 18, 2017 at 5:04 PM, wrote: Hi Ladi & Yan, I am pleased to present the completed driver for review, please see: https://github.com/gnif/kvm-guest-drivers-windows Awesome! Feel free to open pull request, it should be easier to comment on. Great, I will do so after I have addressed the below. Thanks again. * WoW considerations: It would be nice if the driver could detect that the map request is coming from a 32-bit process and expect a different layout of struct IVSHMEM_MMAP. I did think of this but I am unsure as to how to detect this. * It would be cleaner to use READ_REGISTER_* and WRITE_REGISTER_* from/to IVSHMEMDeviceRegisters instead of plain memory accesses. Or at the very least the accesses should be marked volatile. I thought that mapping the IO space was enough for this since it is mapped as non-cacheable. I can see the point of marking it volatile but see no need to use the read/write register semantics. If this is what it takes however I am happy to do so. * In ivshmem.inf: ManufacturerName="Red Hat, Inc." instead of "RedHat" No worries. * Is any of the API used by the driver Win10-only? Just curious, it's fine to build the driver only for Win10 for now even if it isn't. I have not tried to build it on anything older then win 10 build 10586 as I have nothing older, but AFAIK it should build on windows 8.1 or later just fine. This is more due to my lack of familiarity with Visual Studio, give me gcc and vim any day :). Thanks! All issues previously mentioned have been addressed and all missing functionality has been added. Please note that this work has exposed a bug in the qemu ivshmem virtual device itself, it seems that if the MSI interrupts are enabled and the driver is unloaded twice an assertion is thrown due to what looks to be a double free, crashing out qemu. Once this driver has been finalized I will look into the cause of this problem and see if I can correct it also. Kind Regards, Geoffrey McRae
Re: [Qemu-devel] ivshmem Windows Driver
Hi Ladi & Yan,

I am pleased to present the completed driver for review; please see:
https://github.com/gnif/kvm-guest-drivers-windows

All issues previously mentioned have been addressed and all missing functionality has been added.

Please note that this work has exposed a bug in the QEMU ivshmem virtual device itself: it seems that if MSI interrupts are enabled and the driver is unloaded twice, an assertion is thrown due to what looks to be a double free, crashing QEMU. Once this driver has been finalized I will look into the cause of this problem and see if I can correct it also.

Kind Regards,
Geoffrey McRae
Re: [Qemu-devel] ivshmem Windows Driver
On 2017-10-18 17:50, Ladi Prosek wrote: On Wed, Oct 18, 2017 at 7:50 AM, wrote: On 2017-10-18 16:31, Ladi Prosek wrote: Hi Geoff, On Mon, Oct 16, 2017 at 8:31 PM, wrote: Hi Yan & Ladi. I have written an initial implementation that supports just the shared memory mapping at this time. I plan to add events also but before I go further I would like some feedback if possible on what I have implemented thus far. Please see: https://github.com/gnif/kvm-guest-drivers-windows/commit/8655cf12fbdd77b991f96d97bc20f967b5907c12 Thank you, looks good overall. * Please don't use the 'vio' prefix for this driver. ivshmem is not a VirtIO device (i.e. not using the VirtIO protocol). Also the test program should live in a subdirectory, so maybe something like /ivshmem and /ivshmem/test. Noted, I will remove the prefix throughout and move the test application. * In VIOIVSHMEMEvtDevicePrepareHardware: I don't think that Windows guarantees that resources are enumerated in BAR order. In VirtIO drivers we read the PCI config space to identify the BAR index: https://github.com/virtio-win/kvm-guest-drivers-windows/blob/master/VirtIO/VirtIOPCICommon.c#L353 The windows 'toaster' sample relies on the resource order, but as a belt and braces approach I will update the code to use the same approach. Interesting, thanks! If that's really the case then we can remove the code from VirtioLib. I have cloned the latest Windows-driver-samples but can't find this under general/toaster. Namely ToasterEvtDevicePrepareHardware just prints some info about all resources but does not do anything order-related. Can you point me to the right code? Sorry, my mistake, it wasn't the toaster code but the kmdf driver, it assumes the BAR ordering to determine which is which. https://github.com/Microsoft/Windows-driver-samples/blob/aa6e0b36eb932099fa4eb950a6f5e289a23b6d6e/general/pcidrv/kmdf/HW/nic_init.c#L649 * IOCTL codes on Windows have a structure to them: https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/defining-i-o-control-codes Thanks, I will fix this. * In VIOIVSHMEMEvtIoDeviceControl: The "only one mapping at a time is allowed" test has a race. I think that simply making the IO queue WdfIoQueueDispatchSequential instead of WdfIoQueueDispatchParallel will fix it. Good point, I will change this. * According to MSDN, MmMapLockedPagesSpecifyCache(UserMode) should be wrapped in try/except. Also, what happens if the file handle is inherited by a child process? Can it unmap the mapping in parent's address space? What if the parent exits? A possible solution is discussed in this article: http://www.osronline.com/article.cfm?article=39 Noted re try/except. As for a child inheriting it, the owner is tracked by the WDFFILEOBJECT, which the child I believe will inherit also, which would mean that the child would gain the ability to issue IOCTLs to the mapping. Thanks! Ladi No, thank you! I am grateful someone is willing to provide some feedback on this. I have been working on adding MSI interrupt support to the driver also which is close to ready, just trying to figure out why the driver fails to start with STATUS_DEVICE_POWER_FAILURE when I try to setup the IRQs with WdfInterruptCreate. Thanks again, Geoff
Re: [Qemu-devel] ivshmem Windows Driver
On 2017-10-18 16:31, Ladi Prosek wrote: Hi Geoff, On Mon, Oct 16, 2017 at 8:31 PM, wrote: Hi Yan & Ladi. I have written an initial implementation that supports just the shared memory mapping at this time. I plan to add events also but before I go further I would like some feedback if possible on what I have implemented thus far. Please see: https://github.com/gnif/kvm-guest-drivers-windows/commit/8655cf12fbdd77b991f96d97bc20f967b5907c12 Thank you, looks good overall. * Please don't use the 'vio' prefix for this driver. ivshmem is not a VirtIO device (i.e. not using the VirtIO protocol). Also the test program should live in a subdirectory, so maybe something like /ivshmem and /ivshmem/test. Noted, I will remove the prefix throughout and move the test application. * In VIOIVSHMEMEvtDevicePrepareHardware: I don't think that Windows guarantees that resources are enumerated in BAR order. In VirtIO drivers we read the PCI config space to identify the BAR index: https://github.com/virtio-win/kvm-guest-drivers-windows/blob/master/VirtIO/VirtIOPCICommon.c#L353 The windows 'toaster' sample relies on the resource order, but as a belt and braces approach I will update the code to use the same approach. * IOCTL codes on Windows have a structure to them: https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/defining-i-o-control-codes Thanks, I will fix this. * In VIOIVSHMEMEvtIoDeviceControl: The "only one mapping at a time is allowed" test has a race. I think that simply making the IO queue WdfIoQueueDispatchSequential instead of WdfIoQueueDispatchParallel will fix it. Good point, I will change this. * According to MSDN, MmMapLockedPagesSpecifyCache(UserMode) should be wrapped in try/except. Also, what happens if the file handle is inherited by a child process? Can it unmap the mapping in parent's address space? What if the parent exits? A possible solution is discussed in this article: http://www.osronline.com/article.cfm?article=39 Noted re try/except. As for a child inheriting it, the owner is tracked by the WDFFILEOBJECT, which the child I believe will inherit also, which would mean that the child would gain the ability to issue IOCTLs to the mapping. Thanks! Ladi No, thank you! I am grateful someone is willing to provide some feedback on this. I have been working on adding MSI interrupt support to the driver also which is close to ready, just trying to figure out why the driver fails to start with STATUS_DEVICE_POWER_FAILURE when I try to setup the IRQs with WdfInterruptCreate. Thanks again, Geoff
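To illustrate the SEH point raised in the review above in isolation, here is a rough sketch of guarding the user-mode mapping call; MapBarIntoCaller is a hypothetical helper written for this example, not code from the driver under review, and the MDL is assumed to have been built and locked elsewhere:

#include <ntddk.h>

/* Map an already-built MDL describing the shared memory BAR into the
 * calling user-mode process. MmMapLockedPagesSpecifyCache can raise an
 * exception for UserMode mappings, so the call is wrapped in __try. */
static NTSTATUS MapBarIntoCaller(PMDL mdl, PVOID *userVa)
{
    *userVa = NULL;
    __try {
        *userVa = MmMapLockedPagesSpecifyCache(mdl, UserMode, MmNonCached,
                                               NULL, FALSE,
                                               NormalPagePriority);
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        return GetExceptionCode();
    }
    return *userVa ? STATUS_SUCCESS : STATUS_INSUFFICIENT_RESOURCES;
}

Called from the IOCTL handler, this keeps the exception handling in one place and turns a failed user-mode mapping into an ordinary error status rather than a bugcheck.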
Re: [Qemu-devel] ivshmem Windows Driver
Hi Yan & Ladi. I have written an initial implementation that supports just the shared memory mapping at this time. I plan to add events also but before I go further I would like some feedback if possible on what I have implemented thus far. Please see: https://github.com/gnif/kvm-guest-drivers-windows/commit/8655cf12fbdd77b991f96d97bc20f967b5907c12 Kind Regards, Geoff On 2017-10-15 23:29, ge...@hostfission.com wrote: On 2017-10-15 23:24, Yan Vugenfirer wrote: On 15 Oct 2017, at 15:21, ge...@hostfission.com wrote: Hi Yan, Thank you for the information. I am rather new to Windows Driver development and learning as I go, so this may take some time, but since the driver only needs to perform very basic functions I do not see this as being too much of a challenge. I think you can look into Windows virtio-balloon implementation as an example of simple driver: https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/Balloon It relies on virtio library (https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/VirtIO) and it is WDF driver (MS framework that simplifies the drivers development) that makes it very simple. Thanks again, I already have a prototype driver working using WDF, it's more learning the windows internals and how to best implement things. -Geoff On 2017-10-15 22:14, Yan Vugenfirer wrote: He Geoff, The official virtio-win drivers upstream repository is here: https://github.com/virtio-win/kvm-guest-drivers-windows 1. There is no ivshmem Windows Driver for now as far as I know 2. We are signing the drivers for community usage https://fedoraproject.org/wiki/Windows_Virtio_Drivers from the same repository. The process will be: submit the code for review with pull request (better use existing virtio library for virtio communication between the guest and the host), pass internal tests and at the least being able to pass MS HCK\HLK tests, later on the driver will be pulled into official build and release with rest of the drivers for community usage. 3. We are happy to cooperate on adding new functionality to current package of virtio drivers for Windows 4. As already mentioned: https://github.com/virtio-win/kvm-guest-drivers-windows Thanks a lot! If you have more questions, please don’t hesitate to talk to me, Ladi or anyone else from Red Hat involved with virtio-win development. Best regards, Yan. On 15 Oct 2017, at 12:32, geoff--- via Qemu-devel wrote: Hi All, I am writing some code that needs to share a block of ram between a Windows guest and Linux host. For this I am using the ivshmem device and I have written a very primitive driver for windows that allows a single application to request to memory map the pci bar (shared memory) into the program's context using DeviceIoControl. This is all working fine, but the next problem is I need the driver to be signed. In it's current state I would not even suggest it be signed as it was just hacked together to test my concept, but now I know it's viable I would be willing to invest whatever time is required to write a driver that would be acceptable for signing. The ideal driver would be general purpose and could be leveraged for any user mode application use, not just my specific case. It would need to implement the IRQ/even features of ivshmem and possibly even some kind of security to prevent unauthorized use by rogue applications (shared secret configured on the chardev?). I have several qustions: 1) Has someone done this? I can't find any reference to a windows driver for this device anywhere. 
2) If I was to pursue writing this driver, how would be the best way to go about it so as to ensure that it is in a state that it could be signed with the RedHat vendor key? 3) What is the likelihood of having such a driver signed? 4) Is there a preferred git host for such a driver? Kind Regards -Geoff
[Qemu-devel] ivshmem Windows Driver
Hi All,

I am writing some code that needs to share a block of RAM between a Windows guest and a Linux host. For this I am using the ivshmem device, and I have written a very primitive driver for Windows that allows a single application to request to memory map the PCI BAR (shared memory) into the program's context using DeviceIoControl.

This is all working fine, but the next problem is I need the driver to be signed. In its current state I would not even suggest it be signed as it was just hacked together to test my concept, but now I know it's viable I would be willing to invest whatever time is required to write a driver that would be acceptable for signing.

The ideal driver would be general purpose and could be leveraged by any user mode application, not just my specific case. It would need to implement the IRQ/event features of ivshmem and possibly even some kind of security to prevent unauthorized use by rogue applications (shared secret configured on the chardev?).

I have several questions:

1) Has someone done this? I can't find any reference to a Windows driver for this device anywhere.
2) If I was to pursue writing this driver, what would be the best way to go about it so as to ensure that it is in a state that it could be signed with the RedHat vendor key?
3) What is the likelihood of having such a driver signed?
4) Is there a preferred git host for such a driver?

Kind Regards
-Geoff
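To make the user-mode access pattern described above concrete, here is a minimal sketch of opening such a device and requesting the mapping via DeviceIoControl. The device path, IOCTL code and output format are hypothetical placeholders for illustration, not the interface of any existing driver:

#include <windows.h>
#include <winioctl.h>
#include <stdio.h>

/* Hypothetical IOCTL definition, for illustration only. */
#define IOCTL_IVSHMEM_REQUEST_MMAP \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)

int main(void)
{
    /* Hypothetical device path. */
    HANDLE dev = CreateFileA("\\\\.\\ivshmem", GENERIC_READ | GENERIC_WRITE,
                             0, NULL, OPEN_EXISTING, 0, NULL);
    if (dev == INVALID_HANDLE_VALUE) {
        printf("failed to open device: %lu\n", GetLastError());
        return 1;
    }

    /* Ask the driver to map the shared memory BAR into this process and
     * return the user-mode pointer in the output buffer. */
    void *map = NULL;
    DWORD returned = 0;
    if (!DeviceIoControl(dev, IOCTL_IVSHMEM_REQUEST_MMAP,
                         NULL, 0, &map, sizeof(map), &returned, NULL)) {
        printf("mmap ioctl failed: %lu\n", GetLastError());
        CloseHandle(dev);
        return 1;
    }

    printf("shared memory mapped at %p\n", map);
    CloseHandle(dev);
    return 0;
}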
[Qemu-devel] [PATCH] [pckbd] Prevent IRQs when the guest disables the mouse
When the guest OS needs to send mouse commands it will, at least in the case of Windows 10, set the KBD_MODE_DISABLE_MOUSE bit to prevent interrupts from causing stream desynchronisation.

Here is Windows 10 attempting to issue a PS/2 mouse reset without this fix, where you can see the mouse positional data was returned as the answer to the get type command.

KBD: kbd: write cmd=0xd4    // write next cmd to the aux port
KBD: kbd: read status=0x1c
KBD: kbd: read status=0x1c
KBD: kbd: read status=0x1c
KBD: kbd: write data=0xff
kbd: write mouse 0xff       // reset
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0xfa    // ack
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0xaa    // self-test good
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x00    // the device type
KBD: kbd: read status=0x3d
KBD: kbd: write cmd=0xd4    // write cmd to the aux port
KBD: kbd: read status=0x3d
KBD: kbd: write data=0xf2
kbd: write mouse 0xf2       // get type
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x08    // mouse data byte 1
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x00    // mouse data byte 2
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x00    // mouse data byte 3
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0xfa    // the ack for the get type above
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x00    // the device type
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x08    // mouse data byte 1
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x00    // mouse data byte 2

Signed-off-by: Geoffrey McRae
---
 hw/input/pckbd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
index c479f827b6..78d5356817 100644
--- a/hw/input/pckbd.c
+++ b/hw/input/pckbd.c
@@ -168,7 +168,8 @@ static void kbd_update_irq(KBDState *s)
     if (s->pending == KBD_PENDING_AUX) {
         s->status |= KBD_STAT_MOUSE_OBF;
         s->outport |= KBD_OUT_MOUSE_OBF;
-        if (s->mode & KBD_MODE_MOUSE_INT)
+        if ((s->mode & KBD_MODE_MOUSE_INT) &&
+            !(s->mode & KBD_MODE_DISABLE_MOUSE))
             irq_mouse_level = 1;
     } else {
         if ((s->mode & KBD_MODE_KBD_INT) &&
-- 
2.11.0