Re: RFC: New device for zero-copy VM memory access

2019-11-04 Thread geoff




On 2019-11-04 22:55, Dr. David Alan Gilbert wrote:

* ge...@hostfission.com (ge...@hostfission.com) wrote:



On 2019-11-03 21:10, ge...@hostfission.com wrote:
> On 2019-11-01 02:52, Dr. David Alan Gilbert wrote:
> > * ge...@hostfission.com (ge...@hostfission.com) wrote:
> > >
> > >
> > > On 2019-11-01 01:52, Peter Maydell wrote:
> > > > On Thu, 31 Oct 2019 at 14:26,  wrote:
> > > > > As the author of Looking Glass, I also have to consider the
> > > > > maintenance
> > > > > and the complexity of implementing the vhost protocol into the
> > > > > project.
> > > > > At this time a complete Porthole client can be implemented in 150
> > > > > lines
> > > > > of C without external dependencies, and most of that is boilerplate
> > > > > socket code. This IMO is a major factor in deciding to avoid
> > > > > vhost-user.
> > > >
> > > > This is essentially a proposal that we should make our project and
> > > > code more complicated so that your project and code can be simpler.
> > > > I hope you can see why this isn't necessarily an argument that will hold
> > > > very much weight for us :-)
> > >
> > > Certainly, I do which is why I am still going to see about using
> > > vhost,
> > > however, a device that uses vhost is likely more complex than
> > > the device
> > > as it stands right now and as such more maintenance would be
> > > involved on
> > > your end also. Or have I missed something in that vhost-user can
> > > be used
> > > directly as a device?
> >
> > The basic vhost-user stuff isn't actually that hard;  if you aren't
> > actually shuffling commands over the queues you should find it pretty
> > simple - so I think your assumption about it being simpler if you
> > avoid
> > it might be wrong.  It might be easier if you use it!
>
> I have been looking into this and I am yet to find some decent
> documentation or a simple device example I can use to understand how to
> create such a device. Do you know of any reading or examples I can
> obtain
> on how to get an initial do nothing device up and running?
>
> -Geoff

Scratch that, the design just solidified for me and I am now making
progress; however, it seems that vhost-user can't do what we need here:

1) I don't see any way to receive notification of socket disconnection.
In our use case the client app needs to be able to be (re)connected
dynamically. It might be possible to get this event by registering it on
the chardev manually, but this seems like it would be a kludge.


My understanding was that someone added support for reconnection of
vhost-user;  I'm not sure of the detail - cc'ing in Maxime and
Marc-Andre.


2) I don't see any method of notifying the vhost-user client of the
removal of a shared memory mapping. Again, these may not be persistently
mapped in the guest as we have no control over the buffer allocation,
and as such, we need a method to notify the client that the mapping has
become invalid.

3) VHOST_USER_SET_MEM_TABLE is a one-time request; again, this breaks our
usage as we need to change this dynamically at runtime.


I've seen (3) being sent multiple times (It's messy but it happens); so
I think that fixes (2) as well for you.


Yes, but it's ignored.

    /*
     * For non-vring specific requests, like VHOST_USER_SET_MEM_TABLE,
     * we just need send it once in the first time. For later such
     * request, we just ignore it.
     */
    if (vhost_user_one_time_request(msg->hdr.request) && dev->vq_index != 0) {
        msg->hdr.flags &= ~VHOST_USER_NEED_REPLY_MASK;
        return 0;
    }



Dave

Unless there are viable solutions to these problems, there is no way that
vhost-user can be used for this kind of a device.

-Geoff

>
> >
> > Dave
> >
> > > >
> > > > thanks
> > > > -- PMM
> > --
> > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: RFC: New device for zero-copy VM memory access

2019-11-04 Thread geoff




On 2019-11-04 21:26, Gerd Hoffmann wrote:

Hi,


This new device, currently named `introspection` (which needs a more
suitable name, porthole perhaps?), provides a means of translating
guest physical addresses to host virtual addresses, and finally to the
host offsets in RAM for file-backed memory guests. It does this by
means of a simple protocol over a unix socket (chardev) which is
supplied the appropriate fd for the VM's system RAM. The guest (in
this case, Windows), when presented with the address of a userspace
buffer and size, will mlock the appropriate pages into RAM and pass
guest physical addresses to the virtual device.


So, if I understand things correctly, the workflow looks like this:

  (1) guest allocates buffers, using guest ram.
  (2) guest uses these buffers as render target for the gpu
      (pci-assigned I guess?).
  (3) guest passes guest physical address to qemu (via porthole device).
  (4) qemu translates gpa into file offset and passes offsets to
      the client application.
  (5) client application maps all guest ram, then uses the offsets from
      qemu to find the buffers.  Then goes displaying these buffers I guess.

Correct?


Correct; however, step 5 might be a proxy to copy the buffers into another
porthole device in a second VM, allowing VM->VM transfers.
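
For reference, step (5) on the client side amounts to little more than an
mmap of the guest RAM fd plus pointer arithmetic. A minimal sketch,
assuming the fd and the (offset, size) pair have already been received over
the porthole socket (names here are illustrative, not the actual Porthole
client API):

#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

/* Map the buffer that qemu described as (offset, size) inside the guest
 * RAM file whose fd was passed over the unix socket. ram_size is the
 * total size of the memory-backend-file backing the guest. */
static void *map_guest_buffer(int ram_fd, size_t ram_size,
                              uint64_t offset, uint64_t size)
{
    /* map the whole backing file once; individual buffers are just
     * offsets into this single mapping */
    uint8_t *ram = mmap(NULL, ram_size, PROT_READ | PROT_WRITE,
                        MAP_SHARED, ram_fd, 0);
    if (ram == MAP_FAILED) {
        perror("mmap");
        return NULL;
    }
    if (offset + size > ram_size) {
        fprintf(stderr, "buffer lies outside guest RAM\n");
        return NULL;
    }
    return ram + offset; /* zero-copy view of the guest buffer */
}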



Performance aside for now, is it an option for your use case to simply
use both an emulated display device and the assigned gpu, then configure
screen mirroring inside the guest to get the guest display scanned out
to the host?


Unfortunately no, NVidia and AMD devices do not support mirroring their
outputs to a separate GPU unless it's a professional-grade GPU such as a
Quadro or Firepro.

-Geoff



cheers,
  Gerd




Re: RFC: New device for zero-copy VM memory access

2019-11-03 Thread geoff




On 2019-11-03 21:10, ge...@hostfission.com wrote:

On 2019-11-01 02:52, Dr. David Alan Gilbert wrote:

* ge...@hostfission.com (ge...@hostfission.com) wrote:



On 2019-11-01 01:52, Peter Maydell wrote:
> On Thu, 31 Oct 2019 at 14:26,  wrote:
> > As the author of Looking Glass, I also have to consider the
> > maintenance
> > and the complexity of implementing the vhost protocol into the
> > project.
> > At this time a complete Porthole client can be implemented in 150
> > lines
> > of C without external dependencies, and most of that is boilerplate
> > socket code. This IMO is a major factor in deciding to avoid
> > vhost-user.
>
> This is essentially a proposal that we should make our project and
> code more complicated so that your project and code can be simpler.
> I hope you can see why this isn't necessarily an argument that will hold
> very much weight for us :-)

Certainly, I do, which is why I am still going to see about using vhost;
however, a device that uses vhost is likely more complex than the device
as it stands right now, and as such more maintenance would be involved on
your end also. Or have I missed something in that vhost-user can be used
directly as a device?


The basic vhost-user stuff isn't actually that hard; if you aren't
actually shuffling commands over the queues you should find it pretty
simple - so I think your assumption about it being simpler if you avoid
it might be wrong.  It might be easier if you use it!


I have been looking into this and I have yet to find any decent
documentation or a simple device example I can use to understand how to
create such a device. Do you know of any reading or examples I can obtain
on how to get an initial do-nothing device up and running?

-Geoff


Scratch that, the design just solidified for me and I am now making
progress, however it seems that vhost-user can't do what we need here:

1) I don't see any way to receive notification of socket disconnection.
In our use case the client app needs to be able to be (re)connected
dynamically. It might be possible to get this event by registering it on
the chardev manually, but this seems like it would be a kludge.

2) I don't see any method of notifying the vhost-user client of the
removal of a shared memory mapping. Again, these may not be persistently
mapped in the guest as we have no control over the buffer allocation,
and as such, we need a method to notify the client that the mapping has
become invalid.

3) VHOST_USER_SET_MEM_TABLE is a one-time request; again, this breaks our
usage as we need to change this dynamically at runtime.

Unless there are viable solutions to these problems there is no way that
vhost-user can be used for this kind of a device.

-Geoff





Dave


>
> thanks
> -- PMM

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: RFC: New device for zero-copy VM memory access

2019-11-03 Thread geoff




On 2019-11-01 02:52, Dr. David Alan Gilbert wrote:

* ge...@hostfission.com (ge...@hostfission.com) wrote:



On 2019-11-01 01:52, Peter Maydell wrote:
> On Thu, 31 Oct 2019 at 14:26,  wrote:
> > As the author of Looking Glass, I also have to consider the
> > maintenance
> > and the complexity of implementing the vhost protocol into the
> > project.
> > At this time a complete Porthole client can be implemented in 150
> > lines
> > of C without external dependencies, and most of that is boilerplate
> > socket code. This IMO is a major factor in deciding to avoid
> > vhost-user.
>
> This is essentially a proposal that we should make our project and
> code more complicated so that your project and code can be simpler.
> I hope you can see why this isn't necessarily an argument that will hold
> very much weight for us :-)

Certainly, I do, which is why I am still going to see about using vhost;
however, a device that uses vhost is likely more complex than the device
as it stands right now, and as such more maintenance would be involved on
your end also. Or have I missed something in that vhost-user can be used
directly as a device?


The basic vhost-user stuff isn't actually that hard;  if you aren't
actually shuffling commands over the queues you should find it pretty
simple - so I think your assumption about it being simpler if you avoid
it might be wrong.  It might be easier if you use it!


I have been looking into this and I have yet to find any decent
documentation or a simple device example I can use to understand how to
create such a device. Do you know of any reading or examples I can obtain
on how to get an initial do-nothing device up and running?

-Geoff



Dave


>
> thanks
> -- PMM

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: RFC: New device for zero-copy VM memory access

2019-10-31 Thread geoff




On 2019-11-01 01:52, Peter Maydell wrote:

On Thu, 31 Oct 2019 at 14:26,  wrote:
As the author of Looking Glass, I also have to consider the maintenance
and the complexity of implementing the vhost protocol into the project.
At this time a complete Porthole client can be implemented in 150 lines
of C without external dependencies, and most of that is boilerplate
socket code. This IMO is a major factor in deciding to avoid vhost-user.


This is essentially a proposal that we should make our project and
code more complicated so that your project and code can be simpler.
I hope you can see why this isn't necessarily an argument that will hold
very much weight for us :-)


Certainly, I do, which is why I am still going to see about using vhost;
however, a device that uses vhost is likely more complex than the device
as it stands right now, and as such more maintenance would be involved on
your end also. Or have I missed something in that vhost-user can be used
directly as a device?



thanks
-- PMM




Re: RFC: New device for zero-copy VM memory access

2019-10-31 Thread geoff
s to write the Porthole driver which is far simpler than the
IVSHMEM driver. I'd hate to think of the time investment in maintaining
the vhost integration also (yes, I am aware there is a library).

These drivers are not complex and I am sure an experienced windows
driver developer could have thrown them together in a few hours, but
since our requirements are so niche and of little commercial value those
in our community that are using this project do not have the time and/or
ability to assist with the drivers.

From my point of view, avoiding vhost-user seems like a better path to
take, as I am able (and willing) to maintain the Porthole device in QEMU,
the OS drivers, and the client interface. The Porthole device also doesn't
have any special or complex features, keeping both the device and the
client protocol very simple to maintain.

There is also an open-source audio driver for Windows called SCREAM
that was initially designed for broadcasting audio over a network;
however, its authors have also implemented transport via shared RAM.
While vhost-user would make much more sense here, as vring buffers
would be very useful, the barrier to entry is too high and as such the
developers have opted to use the simple IVSHMEM device instead.

That said, I will still have a deeper look into vhost-user but I hope
the above shows the merits of this simple method of guest ram access.

-Geoff



Dave


-Geoff

> > This device and the windows driver have been designed in such a way
> > that
> > it's a
> > utility device for any project and/or application that could make
> > use of it.
> > The PCI subsystem vendor and device ID are used to provide a means
> > of device
> > identification in cases where multiple devices may be in use for
> > differing
> > applications. This also allows one common driver to be used for any
> > other
> > projects wishing to build on this device.
> >
> > My ultimate goal is to get this to a state where it could be accepted
> > upstream
> > into Qemu at which point Looking Glass would be modified to use it
> > instead
> > of
> > the IVSHMEM device.
> >
> > My git repository with the new device can be found at:
> > https://github.com/gnif/qemu
> >
> > The new device is:
> > https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c
> >
> > Looking Glass:
> > https://looking-glass.hostfission.com/
> >
> > The windows driver, while working, needs some cleanup before the
> > source is
> > published. I intend to maintain both this device and the windows
> > driver
> > including producing a signed Windows 10 driver if Redhat are
> > unwilling or
> > unable.
> >
> > Kind Regards,
> > Geoffrey McRae
> >
> > HostFission
> > https://hostfission.com
> >
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: RFC: New device for zero-copy VM memory access

2019-10-31 Thread geoff
Another update to this that adds support for unmap notification. The device
has also been renamed to `porthole` and now resides here:

https://github.com/gnif/qemu/blob/master/hw/misc/porthole.c

And here is the updated Linux test client.

https://gist.github.com/gnif/77e7fb54604b42a1a98ecb8bf3d2cf46

-Geoff

On 2019-10-31 13:55, ge...@hostfission.com wrote:

Hi Dave,

On 2019-10-31 05:52, Dr. David Alan Gilbert wrote:

* ge...@hostfission.com (ge...@hostfission.com) wrote:

Hi All,

Over the past week, I have been working to come up with a solution to the
memory transfer performance issues that hinder the Looking Glass Project.

Currently Looking Glass works by using the IVSHMEM shared memory device
which is fed by an application that captures the guest's video output.
While this works it is sub-optimal because we first have to perform a CPU
copy of the captured frame into shared RAM, and then back out again for
display. Because the destination buffers are allocated by closed
proprietary code (DirectX, or NVidia NvFBC) there is no way to have the
frame placed directly into the IVSHMEM shared ram.

This new device, currently named `introspection` (which needs a more
suitable name, porthole perhaps?), provides a means of translating guest
physical addresses to host virtual addresses, and finally to the host
offsets in RAM for file-backed memory guests. It does this by means of a
simple protocol over a unix socket (chardev) which is supplied the
appropriate fd for the VM's system RAM. The guest (in this case, Windows),
when presented with the address of a userspace buffer and size, will mlock
the appropriate pages into RAM and pass guest physical addresses to the
virtual device.


Hi Geoffrey,
  I wonder if the same thing can be done by using the existing vhost-user
mechanism.

  vhost-user is intended for implementing a virtio device outside of the
qemu process; so it has a character device that qemu passes commands down
to the other process, where qemu mostly passes commands via the virtio
queues.   To be able to read the virtio queues, the external process
mmap's the same memory as the guest - it gets passed a 'set mem table'
command by qemu that includes fd's for the RAM, and includes base/offset
pairs saying that a particular chunk of RAM is mapped at a particular
guest physical address.

  Whether or not you make use of virtio queues, I think the mechanism
for the device to tell the external process the mappings might be what
you're after.
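
(For reference, each region described by that 'set mem table' message
roughly carries the fields below, per the vhost-user spec; this is a
sketch with approximate names, one fd per region being passed as
SCM_RIGHTS ancillary data alongside the message.)

#include <stdint.h>

/* Sketch of one region entry in the VHOST_USER_SET_MEM_TABLE payload
 * (field names approximate; see the vhost-user spec for the real layout). */
struct vhost_user_mem_region_sketch {
    uint64_t guest_phys_addr; /* GPA where the region starts           */
    uint64_t memory_size;     /* length of the region in bytes         */
    uint64_t userspace_addr;  /* QEMU's virtual address for the region */
    uint64_t mmap_offset;     /* offset into the fd the backend mmaps  */
};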

Dave



While normally I would be all for re-using such code, vhost-user, while
being very feature-complete from what I understand, is overkill for our
requirements. It will still allocate a communication ring and an events
system that we will not be using. The goal of this device is to provide a
dumb & simple method of sharing system ram, both for this project and for
others that work on a simple polling mechanism; it is not intended to be
an end-to-end solution like vhost-user is.

If you still believe that vhost-user should be used, I will do what I can
to implement it, but for such a simple device I honestly believe it is
overkill.


-Geoff

This device and the windows driver have been designed in such a way that
it's a utility device for any project and/or application that could make
use of it. The PCI subsystem vendor and device ID are used to provide a
means of device identification in cases where multiple devices may be in
use for differing applications. This also allows one common driver to be
used for any other projects wishing to build on this device.

My ultimate goal is to get this to a state where it could be accepted
upstream into Qemu at which point Looking Glass would be modified to use
it instead of the IVSHMEM device.

My git repository with the new device can be found at:
https://github.com/gnif/qemu

The new device is:
https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c

Looking Glass:
https://looking-glass.hostfission.com/

The windows driver, while working, needs some cleanup before the source is
published. I intend to maintain both this device and the windows driver
including producing a signed Windows 10 driver if Redhat are unwilling or
unable.

Kind Regards,
Geoffrey McRae

HostFission
https://hostfission.com


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: RFC: New device for zero-copy VM memory access

2019-10-30 Thread geoff

Hi Dave,

On 2019-10-31 05:52, Dr. David Alan Gilbert wrote:

* ge...@hostfission.com (ge...@hostfission.com) wrote:

Hi All,

Over the past week, I have been working to come up with a solution to the
memory transfer performance issues that hinder the Looking Glass Project.

Currently Looking Glass works by using the IVSHMEM shared memory device
which is fed by an application that captures the guest's video output.
While this works it is sub-optimal because we first have to perform a CPU
copy of the captured frame into shared RAM, and then back out again for
display. Because the destination buffers are allocated by closed
proprietary code (DirectX, or NVidia NvFBC) there is no way to have the
frame placed directly into the IVSHMEM shared ram.

This new device, currently named `introspection` (which needs a more
suitable name, porthole perhaps?), provides a means of translating guest
physical addresses to host virtual addresses, and finally to the host
offsets in RAM for file-backed memory guests. It does this by means of a
simple protocol over a unix socket (chardev) which is supplied the
appropriate fd for the VM's system RAM. The guest (in this case, Windows),
when presented with the address of a userspace buffer and size, will mlock
the appropriate pages into RAM and pass guest physical addresses to the
virtual device.


Hi Geoffrey,
  I wonder if the same thing can be done by using the existing vhost-user
mechanism.

  vhost-user is intended for implementing a virtio device outside of the
qemu process; so it has a character device that qemu passes commands down
to the other process, where qemu mostly passes commands via the virtio
queues.   To be able to read the virtio queues, the external process
mmap's the same memory as the guest - it gets passed a 'set mem table'
command by qemu that includes fd's for the RAM, and includes base/offset
pairs saying that a particular chunk of RAM is mapped at a particular
guest physical address.

  Whether or not you make use of virtio queues, I think the mechanism
for the device to tell the external process the mappings might be what
you're after.

Dave



While normally I would be all for re-using such code, vhost-user, while
being very feature-complete from what I understand, is overkill for our
requirements. It will still allocate a communication ring and an events
system that we will not be using. The goal of this device is to provide a
dumb & simple method of sharing system ram, both for this project and for
others that work on a simple polling mechanism; it is not intended to be
an end-to-end solution like vhost-user is.

If you still believe that vhost-user should be used, I will do what I can
to implement it, but for such a simple device I honestly believe it is
overkill.


-Geoff

This device and the windows driver have been designed in such a way that
it's a utility device for any project and/or application that could make
use of it. The PCI subsystem vendor and device ID are used to provide a
means of device identification in cases where multiple devices may be in
use for differing applications. This also allows one common driver to be
used for any other projects wishing to build on this device.

My ultimate goal is to get this to a state where it could be accepted
upstream into Qemu at which point Looking Glass would be modified to use
it instead of the IVSHMEM device.

My git repository with the new device can be found at:
https://github.com/gnif/qemu

The new device is:
https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c

Looking Glass:
https://looking-glass.hostfission.com/

The windows driver, while working, needs some cleanup before the source is
published. I intend to maintain both this device and the windows driver
including producing a signed Windows 10 driver if Redhat are unwilling or
unable.

Kind Regards,
Geoffrey McRae

HostFission
https://hostfission.com


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: RFC: New device for zero-copy VM memory access

2019-10-30 Thread geoff

The windows driver source is now available also.

https://github.com/gnif/Porthole-Guest-Driver

I have also opted to rename the device to 'porthole'; hopefully this
name is acceptable.

On 2019-10-30 09:53, ge...@hostfission.com wrote:
Just to follow this up, here is a sample client application for this device:

https://gist.github.com/gnif/77e7fb54604b42a1a98ecb8bf3d2cf46

-Geoff

On 2019-10-30 01:31, ge...@hostfission.com wrote:

Hi All,

Over the past week, I have been working to come up with a solution to the
memory transfer performance issues that hinder the Looking Glass Project.

Currently Looking Glass works by using the IVSHMEM shared memory device
which is fed by an application that captures the guest's video output.
While this works it is sub-optimal because we first have to perform a CPU
copy of the captured frame into shared RAM, and then back out again for
display. Because the destination buffers are allocated by closed
proprietary code (DirectX, or NVidia NvFBC) there is no way to have the
frame placed directly into the IVSHMEM shared ram.

This new device, currently named `introspection` (which needs a more
suitable name, porthole perhaps?), provides a means of translating guest
physical addresses to host virtual addresses, and finally to the host
offsets in RAM for file-backed memory guests. It does this by means of a
simple protocol over a unix socket (chardev) which is supplied the
appropriate fd for the VM's system RAM. The guest (in this case, Windows),
when presented with the address of a userspace buffer and size, will mlock
the appropriate pages into RAM and pass guest physical addresses to the
virtual device.

This device and the windows driver have been designed in such a way that
it's a utility device for any project and/or application that could make
use of it. The PCI subsystem vendor and device ID are used to provide a
means of device identification in cases where multiple devices may be in
use for differing applications. This also allows one common driver to be
used for any other projects wishing to build on this device.

My ultimate goal is to get this to a state where it could be accepted
upstream into Qemu at which point Looking Glass would be modified to use
it instead of the IVSHMEM device.

My git repository with the new device can be found at:
https://github.com/gnif/qemu

The new device is:
https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c

Looking Glass:
https://looking-glass.hostfission.com/

The windows driver, while working, needs some cleanup before the source is
published. I intend to maintain both this device and the windows driver
including producing a signed Windows 10 driver if Redhat are unwilling or
unable.

Kind Regards,
Geoffrey McRae

HostFission
https://hostfission.com




Re: RFC: New device for zero-copy VM memory access

2019-10-29 Thread geoff
Just to follow this up, here is a sample client application for this device:

https://gist.github.com/gnif/77e7fb54604b42a1a98ecb8bf3d2cf46

-Geoff

On 2019-10-30 01:31, ge...@hostfission.com wrote:

Hi All,

Over the past week, I have been working to come up with a solution to the
memory transfer performance issues that hinder the Looking Glass Project.

Currently Looking Glass works by using the IVSHMEM shared memory device
which is fed by an application that captures the guest's video output.
While this works it is sub-optimal because we first have to perform a CPU
copy of the captured frame into shared RAM, and then back out again for
display. Because the destination buffers are allocated by closed
proprietary code (DirectX, or NVidia NvFBC) there is no way to have the
frame placed directly into the IVSHMEM shared ram.

This new device, currently named `introspection` (which needs a more
suitable name, porthole perhaps?), provides a means of translating guest
physical addresses to host virtual addresses, and finally to the host
offsets in RAM for file-backed memory guests. It does this by means of a
simple protocol over a unix socket (chardev) which is supplied the
appropriate fd for the VM's system RAM. The guest (in this case, Windows),
when presented with the address of a userspace buffer and size, will mlock
the appropriate pages into RAM and pass guest physical addresses to the
virtual device.

This device and the windows driver have been designed in such a way that
it's a utility device for any project and/or application that could make
use of it. The PCI subsystem vendor and device ID are used to provide a
means of device identification in cases where multiple devices may be in
use for differing applications. This also allows one common driver to be
used for any other projects wishing to build on this device.

My ultimate goal is to get this to a state where it could be accepted
upstream into Qemu at which point Looking Glass would be modified to use
it instead of the IVSHMEM device.

My git repository with the new device can be found at:
https://github.com/gnif/qemu

The new device is:
https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c

Looking Glass:
https://looking-glass.hostfission.com/

The windows driver, while working, needs some cleanup before the source is
published. I intend to maintain both this device and the windows driver
including producing a signed Windows 10 driver if Redhat are unwilling or
unable.

Kind Regards,
Geoffrey McRae

HostFission
https://hostfission.com




RFC: New device for zero-copy VM memory access

2019-10-29 Thread geoff

Hi All,

Over the past week, I have been working to come up with a solution to the
memory transfer performance issues that hinder the Looking Glass Project.

Currently Looking Glass works by using the IVSHMEM shared memory device
which is fed by an application that captures the guest's video output.
While this works it is sub-optimal because we first have to perform a CPU
copy of the captured frame into shared RAM, and then back out again for
display. Because the destination buffers are allocated by closed
proprietary code (DirectX, or NVidia NvFBC) there is no way to have the
frame placed directly into the IVSHMEM shared ram.

This new device, currently named `introspection` (which needs a more
suitable name, porthole perhaps?), provides a means of translating guest
physical addresses to host virtual addresses, and finally to the host
offsets in RAM for file-backed memory guests. It does this by means of a
simple protocol over a unix socket (chardev) which is supplied the
appropriate fd for the VM's system RAM. The guest (in this case, Windows),
when presented with the address of a userspace buffer and size, will mlock
the appropriate pages into RAM and pass guest physical addresses to the
virtual device.
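
(For anyone unfamiliar with fd passing over unix sockets: the RAM fd
mentioned above arrives as SCM_RIGHTS ancillary data. A minimal, purely
illustrative receive helper - not taken from the actual Porthole client -
might look like this.)

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Receive a single file descriptor (e.g. the guest RAM fd) sent as
 * SCM_RIGHTS ancillary data over the porthole unix socket.
 * Returns the fd, or -1 on error. */
static int recv_fd(int sock)
{
    char data;
    struct iovec iov = { .iov_base = &data, .iov_len = sizeof(data) };
    char ctrl[CMSG_SPACE(sizeof(int))];
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = ctrl, .msg_controllen = sizeof(ctrl),
    };
    struct cmsghdr *c;
    int fd;

    if (recvmsg(sock, &msg, 0) <= 0) {
        return -1;
    }
    c = CMSG_FIRSTHDR(&msg);
    if (!c || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(c), sizeof(fd));
    return fd;
}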

This device and the windows driver have been designed in such a way that
it's a utility device for any project and/or application that could make
use of it. The PCI subsystem vendor and device ID are used to provide a
means of device identification in cases where multiple devices may be in
use for differing applications. This also allows one common driver to be
used for any other projects wishing to build on this device.

My ultimate goal is to get this to a state where it could be accepted
upstream into Qemu at which point Looking Glass would be modified to use
it instead of the IVSHMEM device.

My git repository with the new device can be found at:
https://github.com/gnif/qemu

The new device is:
https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c

Looking Glass:
https://looking-glass.hostfission.com/

The windows driver, while working, needs some cleanup before the source is
published. I intend to maintain both this device and the windows driver
including producing a signed Windows 10 driver if Redhat are unwilling or
unable.

Kind Regards,
Geoffrey McRae

HostFission
https://hostfission.com



Re: Adding a memory alias breaks v-rings

2019-10-24 Thread geoff--- via

Hi Phil

On 2019-10-24 22:07, Philippe Mathieu-Daudé wrote:

Hi Geoffrey,

On 10/24/19 10:27 AM, ge...@hostfission.com wrote:

Hi All,

I have been working on adding a feature as a proof of concept to 
improve the performance of applications like Looking Glass by avoiding 
additional memory copies. My goal is to alias part of the IVSHMEM 
shared memory over a pointer provided by the guest OS capture API 
(DXGI Desktop Duplication or NVIDIA Frame Buffer Capture). I have 
managed to get this working by adding a few additional configuration 
registers to the IVSHMEM device and enhanced the IVSHMEM windows 
driver with suitable IOCTLs to set this all up. While this concept is 
backwards it needs to work this way as we do not have control over the 
destination buffer allocation by the GPU driver.


This all works, however, it has exposed a bug (or I am doing things 
improperly) with the way that vhost tracks memory. When calling 
memory_region_add_subregion_overlap the memory listener in vhost fires 
triggering vhost_region_add_section. According to the comments this 
code depends on being called in memory address order, but because I am 
adding the alias region late, it's out of order, and also splits the 
upper memory region. This has the effect of corrupting/breaking one or 
more random vrings, as evidenced by the crash/hang of vhost-net or 
other virtio devices.


I'm not sure this is the same issue I had before, but you might
find Frederic and Alexey suggestions from this thread helpful:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg525833.html

Also note vhost_region_add_section() you mentioned has this comment:

    if (need_add) {
        ...
        /* The flatview isn't stable and we don't use it, making it NULL
         * means we can memcmp the list.
         */
        dev->tmp_sections[dev->n_tmp_sections - 1].fv = NULL;

Maybe you need this change:

-- >8 --
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -642,6 +642,7 @@ static void vhost_region_add_section(struct vhost_dev *dev,
          */
         dev->tmp_sections[dev->n_tmp_sections - 1].fv = NULL;
         memory_region_ref(section->mr);
+        memory_region_update_container_subregions(section->mr);
     }
 }

---


Unfortunately not; `memory_region_update_container_subregions` is private
in memory.c and hangs the VM even if I expose it and call it as you
suggested. It is also already called via
memory_region_add_subregion_overlap anyway.


Thanks for the suggestion though :)



Regards,

Phil.


The following errors are also logged regarding section alignment:

qemu-system-x86_64: vhost_region_add_section:Section rounded to 3c000 prior to previous 3fc4f9000
qemu-system-x86_64: vhost_region_add_section:Section rounded to 3c000 prior to previous 3fc4f9000


Here is the flat view after the alias has been added.

   0001-0003fc4f8fff (prio 0, ram): mem @8000 kvm
   0003fc4f9000-0003fc4f9fff (prio 1, ram): ivshmem kvm
   0003fc4fa000-00043fff (prio 0, ram): mem @00037c4fa000 kvm


When the guest doesn't crash out due to the obvious corruption it is 
possible to verify that the alias is in the right place and fully 
functional. Unfortunately, I simply do not have enough of a grasp on 
vhost to understand exactly what is going on and how to correct it.


Getting this feature working is highly desirable as it should be 
possible to obtain GPU -> GPU memory transfers between guests without 
requiring workstation/professional graphics cards.


Kind Regards,
Geoffrey McRae





Adding a memory alias breaks v-rings

2019-10-24 Thread geoff

Hi All,

I have been working on adding a feature as a proof of concept to improve 
the performance of applications like Looking Glass by avoiding 
additional memory copies. My goal is to alias part of the IVSHMEM shared 
memory over a pointer provided by the guest OS capture API (DXGI Desktop 
Duplication or NVIDIA Frame Buffer Capture). I have managed to get this 
working by adding a few additional configuration registers to the 
IVSHMEM device and enhanced the IVSHMEM windows driver with suitable 
IOCTLs to set this all up. While this concept is backwards it needs to 
work this way as we do not have control over the destination buffer 
allocation by the GPU driver.
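
For context, the aliasing described above boils down to something like the
following, using QEMU's public memory API (a sketch only; the function and
variable names and the surrounding plumbing are made up here, not taken
from the actual patch):

#include "qemu/osdep.h"
#include "exec/memory.h"

/* Overlay a window of the IVSHMEM BAR2 region on top of guest RAM at
 * guest_paddr. Priority 1 makes the alias win over the underlying RAM. */
static void map_ivshmem_alias(MemoryRegion *sysmem, MemoryRegion *bar2,
                              hwaddr guest_paddr, uint64_t offset,
                              uint64_t size)
{
    MemoryRegion *alias = g_new0(MemoryRegion, 1);

    /* expose a window into BAR2 starting at 'offset' ... */
    memory_region_init_alias(alias, NULL, "ivshmem-alias", bar2,
                             offset, size);
    /* ... and overlay it on the system memory at 'guest_paddr' */
    memory_region_add_subregion_overlap(sysmem, guest_paddr, alias, 1);
}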


This all works, however, it has exposed a bug (or I am doing things 
improperly) with the way that vhost tracks memory. When calling 
memory_region_add_subregion_overlap the memory listener in vhost fires 
triggering vhost_region_add_section. According to the comments this code 
depends on being called in memory address order, but because I am adding 
the alias region late, it's out of order, and also splits the upper 
memory region. This has the effect of corrupting/breaking one or more 
random vrings, as evidenced by the crash/hang of vhost-net or other 
virtio devices. The following errors are also logged regarding
section alignment:

qemu-system-x86_64: vhost_region_add_section:Section rounded to 3c000 prior to previous 3fc4f9000
qemu-system-x86_64: vhost_region_add_section:Section rounded to 3c000 prior to previous 3fc4f9000


Here is the flat view after the alias has been added.

  0001-0003fc4f8fff (prio 0, ram): mem @8000 kvm
  0003fc4f9000-0003fc4f9fff (prio 1, ram): ivshmem kvm
  0003fc4fa000-00043fff (prio 0, ram): mem @00037c4fa000 kvm


When the guest doesn't crash out due to the obvious corruption it is 
possible to verify that the alias is in the right place and fully 
functional. Unfortunately, I simply do not have enough of a grasp on 
vhost to understand exactly what is going on and how to correct it.


Getting this feature working is highly desirable as it should be 
possible to obtain GPU -> GPU memory transfers between guests without 
requiring workstation/professional graphics cards.


Kind Regards,
Geoffrey McRae



Re: [Qemu-devel] [RFC for-3.2 PATCH 0/7] pcie: Enhanced link speed and width support

2018-11-14 Thread geoff--- via Qemu-devel
I can confirm that these patches work as expected. Thank you kindly Alex 
for your hard work!


Tested-by: Geoffrey McRae 

On 2018-11-15 07:50, Alex Williamson wrote:

QEMU exposes gen1 PCI-express interconnect devices supporting only
2.5GT/s and x1 width.  It might not seem obvious that a virtual
bandwidth limitation can result in a real performance degradation, but
it's been reported that in some configurations assigned GPUs might not
scale their link speed up to the maximum supported value if the
downstream port above it only advertises limited link support.

As proposed[1] this series effectively implements virtual link
negotiation on downstream ports and enhances the generic PCIe root
port to allow user configurable speeds and widths.  The "negotiation"
simply mirrors the link status of the connected downstream device
providing the appearance of dynamic link speed scaling to match the
endpoint device.  Not yet implemented from the proposal is support
for globally updating defaults based on machine type, though the
foundation is provided here by allowing supporting PCIESlots to
implement an instance_init callback which can call into a common
helper for this.

I have not specifically tested migration with this, but we already
consider LNKSTA to be dynamic and the other changes implemented here
are static config space changes with no changes being implemented for
devices using default values, ie. they should be compatible by virtue
of existing config space migration support.

I think I've covered the required link related registers to support
PCIe 4.0, but please let me know if I've missed any.

Testing and feedback appreciated, patch 6/7 provides example qemu:arg
options and requirements to use with existing libvirt.  Native libvirt
support TBD.  Thanks,

Alex

[1] https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03086.html

---

Alex Williamson (7):
  pcie: Create enums for link speed and width
  pci: Sync PCIe downstream port LNKSTA on read
  qapi: Define PCIe link speed and width properties
  pcie: Add link speed and width fields to PCIESlot
  pcie: Fill PCIESlot link fields to support higher speeds and widths
  pcie: Allow generic PCIe root port to specify link speed and width

  vfio/pci: Remove PCIe Link Status emulation


 hw/core/qdev-properties.c  |  178 


 hw/pci-bridge/gen_pcie_root_port.c |2
 hw/pci-bridge/pcie_root_port.c |   14 +++
 hw/pci/pci.c   |4 +
 hw/pci/pcie.c  |  118 +++-
 hw/vfio/pci.c  |9 --
 include/hw/pci/pci.h   |   13 +++
 include/hw/pci/pcie.h  |1
 include/hw/pci/pcie_port.h |4 +
 include/hw/pci/pcie_regs.h |   23 -
 include/hw/qdev-properties.h   |8 ++
 qapi/common.json   |   42 
 12 files changed, 404 insertions(+), 12 deletions(-)




Re: [Qemu-devel] [PATCH 1/2] ps2: Clear the queue on PS/2 mouse reset and obey device disable

2018-05-07 Thread geoff--- via Qemu-devel

On 2018-05-07 22:41, Gerd Hoffmann wrote:
On Mon, May 07, 2018 at 10:26:24PM +1000, geoff--- via Qemu-devel 
wrote:

On 2018-05-07 22:21, Gerd Hoffmann wrote:
> On Mon, May 07, 2018 at 10:00:22PM +1000, geoff--- via Qemu-devel wrote:
> > This allows guests to correctly reinitialize and identify the mouse
> > should the guest decide to re-scan or reset during mouse input events.
> >
> > Signed-off-by: Geoffrey McRae 
> > ---
> >  hw/input/ps2.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/hw/input/ps2.c b/hw/input/ps2.c
> > index 06f5d2ac4a..6edf046820 100644
> > --- a/hw/input/ps2.c
> > +++ b/hw/input/ps2.c
> > @@ -673,6 +673,9 @@ static void ps2_mouse_sync(DeviceState *dev)
> >  {
> >  PS2MouseState *s = (PS2MouseState *)dev;
> >
> > +if (!(s->mouse_status & MOUSE_STATUS_ENABLED))
> > +return;
> > +
>
> Why this is needed?

To quote: 
https://wiki.osdev.org/%228042%22_PS/2_Controller#Detecting_PS.2F2_Device_Types


The device should respond to the "identify" command by sending a sequence
of none, one or two identification bytes. However, if you just send the
"identify" command you can't prevent the response from the "identify"
command from being mixed up with keyboard/mouse data. To fix this problem,
you need to send the "disable scanning" command first. Disabling scanning
means that the device ignores the user (e.g. keyboards ignore keypresses,
mice ignore mouse movement and button presses, etc) and won't send data to
mess your device identification code up.


Ok.  Same check should be added to ps2_keyboard_event() then I guess?


Quite correct, I will include this in the next patch set.



cheers,
  Gerd





Re: [Qemu-devel] [PATCH 2/2] ps2: Fix mouse stream corruption due to lost data

2018-05-07 Thread geoff--- via Qemu-devel

On 2018-05-07 22:34, Gerd Hoffmann wrote:

diff --git a/hw/input/ps2.c b/hw/input/ps2.c
index 6edf046820..011290920f 100644
--- a/hw/input/ps2.c
+++ b/hw/input/ps2.c
@@ -192,12 +192,50 @@ void ps2_queue(PS2State *s, int b)
 {
 PS2Queue *q = &s->queue;

-if (q->count >= PS2_QUEUE_SIZE - 1)
+if (q->count == PS2_QUEUE_SIZE)
+{
+printf("Warning! PS2 Queue Overflow!\n");
 return;
+}


Leftover debug printf?


Correct :), I will remove it.




+void ps2_raise(PS2State *s)
+{
+s->update_irq(s->update_arg, 1);
+}
+
+void ps2_queue_raise(PS2State *s, int b)
+{
+ps2_queue(s, b);
+s->update_irq(s->update_arg, 1);
+}


I'd suggest to keep the ps2_queue() name.  Makes the patch much smaller
and easier to review.  Factor out the code to actually queue things to
a new ps2_queue_noirq() function.


+void ps2_queue_bytes(PS2State *s, const int length, ...)


Ack.



I'd prefer to not use varargs here as gcc can't check the arguments then.

Suggest to just have ps2_queue_{2,3,4}() helpers instead to queue
multibyte messages.


Ack.



cheers,
  Gerd


Thanks,
  Geoff
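
A sketch of what the ps2_queue_noirq()/ps2_queue_3() split Gerd suggests
above might look like, based on the code quoted in this thread (the
version that eventually lands may differ):

/* Enqueue one byte without raising the interrupt. */
static void ps2_queue_noirq(PS2State *s, int b)
{
    PS2Queue *q = &s->queue;

    if (q->count == PS2_QUEUE_SIZE) {
        return;                        /* drop on overflow */
    }
    q->data[q->wptr] = b;
    if (++q->wptr == PS2_QUEUE_SIZE) {
        q->wptr = 0;
    }
    q->count++;
}

/* Queue a three byte response and raise the interrupt once, only if the
 * whole packet fits - this avoids partial multi-byte responses. */
void ps2_queue_3(PS2State *s, int b1, int b2, int b3)
{
    if (PS2_QUEUE_SIZE - s->queue.count < 3) {
        return;
    }
    ps2_queue_noirq(s, b1);
    ps2_queue_noirq(s, b2);
    ps2_queue_noirq(s, b3);
    s->update_irq(s->update_arg, 1);
}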



Re: [Qemu-devel] [PATCH 1/2] ps2: Clear the queue on PS/2 mouse reset and obey device disable

2018-05-07 Thread geoff--- via Qemu-devel

On 2018-05-07 22:21, Gerd Hoffmann wrote:
On Mon, May 07, 2018 at 10:00:22PM +1000, geoff--- via Qemu-devel 
wrote:

This allows guests to correctly reinitialize and identify the mouse
should the guest decide to re-scan or reset during mouse input events.

Signed-off-by: Geoffrey McRae 
---
 hw/input/ps2.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/input/ps2.c b/hw/input/ps2.c
index 06f5d2ac4a..6edf046820 100644
--- a/hw/input/ps2.c
+++ b/hw/input/ps2.c
@@ -673,6 +673,9 @@ static void ps2_mouse_sync(DeviceState *dev)
 {
 PS2MouseState *s = (PS2MouseState *)dev;

+if (!(s->mouse_status & MOUSE_STATUS_ENABLED))
+return;
+


Why this is needed?


To quote: 
https://wiki.osdev.org/%228042%22_PS/2_Controller#Detecting_PS.2F2_Device_Types


The device should respond to the "identify" command by sending a 
sequence of none, one or two identification bytes. However, if you just 
send the "identify" command you can't prevent the response from the 
"identify" command from being mixed up with keyboard/mouse data. To fix 
this problem, you need to send the "disable scanning" command first. 
Disabling scanning means that the device ignores the user (e.g. 
keyboards ignore keypresses, mice ignore mouse movement and button 
presses, etc) and won't send data to mess your device identification 
code up.
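
The command bytes involved in that sequence, for reference (standard PS/2
values, not specific to this patch):

/* Standard PS/2 command/reply bytes used in the identify sequence the
 * osdev wiki describes (host -> device unless noted otherwise). */
enum {
    PS2_CMD_DISABLE_SCANNING = 0xF5, /* stop reporting so input can't
                                        interleave with the ID bytes   */
    PS2_CMD_IDENTIFY         = 0xF2, /* reply: ACK + 0..2 ID bytes     */
    PS2_CMD_ENABLE_SCANNING  = 0xF4, /* resume normal reporting        */
    PS2_REPLY_ACK            = 0xFA, /* device -> host acknowledge     */
};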





@@ -776,6 +779,7 @@ void ps2_write_mouse(void *opaque, int val)
 s->mouse_resolution = 2;
 s->mouse_status = 0;
 s->mouse_type = 0;
+ps2_reset_queue(&s->common);


Looks good.

cheers,
  Gerd





[Qemu-devel] [PATCH 2/2] ps2: Fix mouse stream corruption due to lost data

2018-05-07 Thread geoff--- via Qemu-devel

This fixes an issue by adding bounds checking to multi-byte packets
where the PS/2 mouse data stream may become corrupted due to data
being discarded when the PS/2 ringbuffer is full.

Interrupts for Multi-byte responses are postponed until the final
byte has been queued.

These changes fix a bug where Windows guests drop the mouse device
entirely, requiring the guest to be restarted.

Signed-off-by: Geoffrey McRae 
---
 hw/input/pckbd.c |   6 +--
 hw/input/ps2.c   | 160 +--

 2 files changed, 110 insertions(+), 56 deletions(-)

diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
index f17f18e51b..004ea3466d 100644
--- a/hw/input/pckbd.c
+++ b/hw/input/pckbd.c
@@ -216,9 +216,9 @@ static uint64_t kbd_read_status(void *opaque, hwaddr 
addr,

 static void kbd_queue(KBDState *s, int b, int aux)
 {
 if (aux)
-ps2_queue(s->mouse, b);
+ps2_queue_raise(s->mouse, b);
 else
-ps2_queue(s->kbd, b);
+ps2_queue_raise(s->kbd, b);
 }

 static void outport_write(KBDState *s, uint32_t val)
diff --git a/hw/input/ps2.c b/hw/input/ps2.c
index 6edf046820..011290920f 100644
--- a/hw/input/ps2.c
+++ b/hw/input/ps2.c
@@ -192,12 +192,50 @@ void ps2_queue(PS2State *s, int b)
 {
 PS2Queue *q = &s->queue;

-if (q->count >= PS2_QUEUE_SIZE - 1)
+if (q->count == PS2_QUEUE_SIZE)
+{
+printf("Warning! PS2 Queue Overflow!\n");
 return;
+}
+
 q->data[q->wptr] = b;
 if (++q->wptr == PS2_QUEUE_SIZE)
 q->wptr = 0;
 q->count++;
+}
+
+void ps2_raise(PS2State *s)
+{
+s->update_irq(s->update_arg, 1);
+}
+
+void ps2_queue_raise(PS2State *s, int b)
+{
+ps2_queue(s, b);
+s->update_irq(s->update_arg, 1);
+}
+
+void ps2_queue_bytes(PS2State *s, const int length, ...)
+{
+PS2Queue *q = &s->queue;
+
+if (PS2_QUEUE_SIZE - q->count < length) {
+printf("Unable to send %d bytes, buffer full\n", length);
+return;
+}
+
+va_list args;
+va_start(args, length);
+
+for(int i = 0; i < length; ++i)
+{
+q->data[q->wptr] = va_arg(args, int);
+if (++q->wptr == PS2_QUEUE_SIZE)
+q->wptr = 0;
+q->count++;
+}
+
+va_end(args);
 s->update_irq(s->update_arg, 1);
 }

@@ -213,13 +251,13 @@ static void ps2_put_keycode(void *opaque, int 
keycode)

 if (keycode == 0xf0) {
 s->need_high_bit = true;
 } else if (s->need_high_bit) {
-ps2_queue(&s->common, translate_table[keycode] | 0x80);
+ps2_queue_raise(&s->common, translate_table[keycode] | 
0x80);

 s->need_high_bit = false;
 } else {
-ps2_queue(&s->common, translate_table[keycode]);
+ps2_queue_raise(&s->common, translate_table[keycode]);
 }
 } else {
-ps2_queue(&s->common, keycode);
+ps2_queue_raise(&s->common, keycode);
 }
 }

@@ -490,72 +528,80 @@ void ps2_write_keyboard(void *opaque, int val)
 case -1:
 switch(val) {
 case 0x00:
-ps2_queue(&s->common, KBD_REPLY_ACK);
+ps2_queue_raise(&s->common, KBD_REPLY_ACK);
 break;
 case 0x05:
-ps2_queue(&s->common, KBD_REPLY_RESEND);
+ps2_queue_raise(&s->common, KBD_REPLY_RESEND);
 break;
 case KBD_CMD_GET_ID:
-ps2_queue(&s->common, KBD_REPLY_ACK);
 /* We emulate a MF2 AT keyboard here */
-ps2_queue(&s->common, KBD_REPLY_ID);
 if (s->translate)
-ps2_queue(&s->common, 0x41);
+ps2_queue_bytes(&s->common, 3,
+KBD_REPLY_ACK,
+KBD_REPLY_ID,
+0x41);
 else
-ps2_queue(&s->common, 0x83);
+ps2_queue_bytes(&s->common, 3,
+KBD_REPLY_ACK,
+KBD_REPLY_ID,
+0x83);
 break;
 case KBD_CMD_ECHO:
-ps2_queue(&s->common, KBD_CMD_ECHO);
+ps2_queue_raise(&s->common, KBD_CMD_ECHO);
 break;
 case KBD_CMD_ENABLE:
 s->scan_enabled = 1;
-ps2_queue(&s->common, KBD_REPLY_ACK);
+ps2_queue_raise(&s->common, KBD_REPLY_ACK);
 break;
 case KBD_CMD_SCANCODE:
 case KBD_CMD_SET_LEDS:
 case KBD_CMD_SET_RATE:
 s->common.write_cmd = val;
-ps2_queue(&s->common, KBD_REPLY_ACK);
+ps2_queue_raise(&s->common, KBD_REPLY_ACK);
 break;
 case KBD_CMD_RESET_DISABLE:
 ps2_reset_keyboard(s);
 s->scan_enabled = 0;
-ps2_queue(&s->common, KBD_REPLY_ACK);
+ps2_queue_raise(&s->common, KBD_REPLY_ACK);
 break;
 case KBD_CMD_RESET_ENABLE:
 ps2_reset_keyboard(s);
 s->scan_enabled = 1;
-ps2_queue(&s->common, KBD_REPLY_ACK);
+ps2_queue_ra

[Qemu-devel] [PATCH 1/2] ps2: Clear the queue on PS/2 mouse reset and obey device disable

2018-05-07 Thread geoff--- via Qemu-devel

This allows guests to correctly reinitialize and identify the mouse
should the guest decide to re-scan or reset during mouse input events.

Signed-off-by: Geoffrey McRae 
---
 hw/input/ps2.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/input/ps2.c b/hw/input/ps2.c
index 06f5d2ac4a..6edf046820 100644
--- a/hw/input/ps2.c
+++ b/hw/input/ps2.c
@@ -673,6 +673,9 @@ static void ps2_mouse_sync(DeviceState *dev)
 {
 PS2MouseState *s = (PS2MouseState *)dev;

+if (!(s->mouse_status & MOUSE_STATUS_ENABLED))
+return;
+
 if (s->mouse_buttons) {
 qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
 }
@@ -776,6 +779,7 @@ void ps2_write_mouse(void *opaque, int val)
 s->mouse_resolution = 2;
 s->mouse_status = 0;
 s->mouse_type = 0;
+ps2_reset_queue(&s->common);
 ps2_queue(&s->common, AUX_ACK);
 ps2_queue(&s->common, 0xaa);
 ps2_queue(&s->common, s->mouse_type);
--
2.14.2




Re: [Qemu-devel] [PATCH v7 0/9] i386: Enable TOPOEXT to support hyperthreading on AMD CPU

2018-04-26 Thread geoff--- via Qemu-devel

Works well for me, thanks!

Tested-by: Geoffrey McRae 

On 2018-04-27 02:26, Babu Moger wrote:
This series enables the TOPOEXT feature for AMD CPUs. This is required to
support hyperthreading on kvm guests.

This addresses the issues reported in these bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=1481253
https://bugs.launchpad.net/qemu/+bug/1703506

v7:
 Rebased on top of latest tree after 2.12 release and done few basic
 tests. There are no changes except for few minor hunks. Hopefully this
 gets pulled into 2.13 release. Please review, let me know of any feedback.

v6:
1.Fixed problem with patch#4 (Add new property to control cache info).
  The parameter legacy_cache should be "on" by default on machine type
  "pc-q35-2.10". This was found by Alexandr Iarygin.
2.Fixed the l3 cache size for EPYC based machines (patch#3). Also, fixed
  the number of logical processors sharing the cache (patch#6). Only L3
  cache is shared by multiple cores but not L1 or L2. This was a bug while
  decoding. This was found by Geoffrey McRae and he verified the fix.

v5:
 In this series I tried to address the feedback from Eduardo Habkost.
 The discussion thread is here.
 https://patchwork.kernel.org/patch/10299745/
 The previous thread is here.
 http://patchwork.ozlabs.org/cover/884885/

Reason for these changes.
 The cache properties for AMD family of processors have changed from
 previous releases. We don't want to display the new information on the
 old family of processors as this might cause compatibility issues.

Changes:
1.Based the patches on top of Eduardo's(patch#1) patch.
  Changed few things.
  Moved the Cache definitions to cpu.h file.
  Changed the CPUID_4 names to generic names.
2.Added a new property "legacy-cache" in cpu object (patch#2). This can be
  used to display the old property even if the host supports the new cache
  properties.
3.Added cache information in X86CPUDefinition and CPUX86State
4.Patch 6-7 changed quite a bit from previous version due to new approach.

5.Addressed few issues with CPUID_8000_001d and CPUID_8000_001E.

v4:
1.Removed the checks under cpuid 0x801D leaf (patch #2). These checks are
  not necessary. Found this during internal review.
2.Added CPUID_EXT3_TOPOEXT feature for all the 17 family (patch #4). This
  was found by Kash Pande during his testing.
3.Removed the hardcoded cpuid xlevel and dynamically extended it if
  CPUID_EXT3_TOPOEXT is supported (Suggested by Brijesh Singh).

v3:
1.Removed the patch #1. Radim mentioned that original typo problem is in
  linux kernel header. qemu is just copying those files.
2.In previous version, I used the cpuid 4 definitions for AMDs cpuid leaf
  0x801D. CPUID 4 is very intel specific and we don't want to expose those
  details under AMD. I have renamed some of these definitions as generic.
  These changes are in patch#1. Radim, let me know if this is what you
  intended.

3.Added assert for core_id (Suggested by Radim Krčmář).
4.Changed the if condition under "L3 cache info" (Suggested by Gary Hook).
5.Addressed few more text corrections and code cleanup (Suggested by
  Thomas Lendacky).

v2:
  Fixed few more minor issues per Gary Hook's comments. Thank you Gary.
  Removed the patch#1. We need to handle the instruction cache
  associativity separately. It varies based on the cpu family. I will
  come back to that later.
  Added two more typo corrections in patch#1 and patch#5.

v1:
  Stanislav Lanci posted few patches earlier.
  https://patchwork.kernel.org/patch/10040903/

Rebased his patches with few changes.
1.Split the patches into two, separating cpuid functions
  0x801D and 0x801E (Patch 2 and 3).
2.Removed the generic non-intel check and made a separate patch
  with some changes (Patch 5).
3.Fixed L3_N_SETS_AMD (from 4096 to 8192) based on CPUID_Fn801D_ECX_x03.


Added 2 more patches.
Patch 1. Fixes cache associativity.
Patch 4. Adds TOPOEXT feature on AMD EPYC CPU.


Babu Moger (8):
  i386: Add cache information in X86CPUDefinition
  i386: Initialize cache information for EPYC family processors
  i386: Add new property to control cache info
  i386: Use the statically loaded cache definitions
  i386: Populate AMD Processor Cache Information for cpuid 0x801D
  i386: Add support for CPUID_8000_001E for AMD
  i386: Enable TOPOEXT feature on AMD EPYC CPU
  i386: Remove generic SMT thread check

Eduardo Habkost (1):
  i386: Helpers to encode cache information consistently

 include/hw/i386/pc.h |   4 +
 target/i386/cpu.c| 736 ++-
 target/i386/cpu.h|  66 +
 target/i386/kvm.c|  29 +-
 4 files changed, 702 insertions(+), 133 deletions(-)





[Qemu-devel] Linux Guest Memory Performance

2018-02-03 Thread geoff--- via Qemu-devel

Hi All,

I am having some very strange issues with Qemu and memory copy 
performance. It seems that when performing buffer -> buffer copies of 
8MB or lower the performance is horrid.


Test program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <time.h>

static inline uint64_t nanotime()
{
  struct timespec time;
  clock_gettime(CLOCK_MONOTONIC_RAW, &time);
  return ((uint64_t)time.tv_sec * 1e9) + time.tv_nsec;
}

int main(int argc, char * argv[])
{
  const int s = atoi(argv[1]);
  int size = s * 1024 * 1024;
  char * buffer1 = malloc(size);
  char * buffer2 = malloc(size);

  uint64_t t = nanotime();
  for(int i = 0; i < 1000; ++i)
memcpy(buffer1, buffer2, size);
  printf("%2u MB = %f ms\n", s, ((float)(nanotime() - t) / 1000.0f)
  / 100.0f);

  free(buffer1);
  free(buffer2);
  return 0;
}

Compiled with: gcc main.c -O3

Native Output:

#  for I in `seq 1 32`; do ./a.out $I; done
 1 MB = 0.026123 ms
 2 MB = 0.048406 ms
 3 MB = 0.073877 ms
 4 MB = 0.096974 ms
 5 MB = 0.115063 ms
 6 MB = 0.139025 ms
 7 MB = 0.163888 ms
 8 MB = 0.187360 ms
 9 MB = 0.203941 ms
10 MB = 0.227855 ms
11 MB = 0.251903 ms
12 MB = 0.279699 ms
13 MB = 0.296424 ms
14 MB = 0.315042 ms
15 MB = 0.340979 ms
16 MB = 0.358750 ms
17 MB = 0.382865 ms
18 MB = 0.403458 ms
19 MB = 0.426864 ms
20 MB = 0.448165 ms
21 MB = 0.473857 ms
22 MB = 0.493515 ms
23 MB = 0.520299 ms
24 MB = 0.538550 ms
25 MB = 0.566735 ms
26 MB = 0.588072 ms
27 MB = 0.612500 ms
28 MB = 0.633682 ms
29 MB = 0.659352 ms
30 MB = 0.690467 ms
31 MB = 0.698611 ms
32 MB = 0.721284 ms

Guest Output:

# for I in `seq 1 32`; do ./a.out $I; done
 1 MB = 0.026120 ms
 2 MB = 0.049053 ms
 3 MB = 0.081695 ms
 4 MB = 0.126873 ms
 5 MB = 0.161380 ms
 6 MB = 0.316972 ms
 7 MB = 0.492851 ms
 8 MB = 0.673696 ms
 9 MB = 0.221208 ms
10 MB = 0.256582 ms
11 MB = 0.276354 ms
12 MB = 0.316020 ms
13 MB = 0.327643 ms
14 MB = 0.363536 ms
15 MB = 0.382575 ms
16 MB = 0.401538 ms
17 MB = 0.436602 ms
18 MB = 0.473452 ms
19 MB = 0.491850 ms
20 MB = 0.527252 ms
21 MB = 0.546229 ms
22 MB = 0.561816 ms
23 MB = 0.582428 ms
24 MB = 0.614430 ms
25 MB = 0.660698 ms
26 MB = 0.670087 ms
27 MB = 0.688908 ms
28 MB = 0.714887 ms
29 MB = 0.746829 ms
30 MB = 0.763404 ms
31 MB = 0.780527 ms
32 MB = 0.821888 ms

Note that leading up to 8MB the copy is getting slower, but once the 
copy exceeds 8MB the copy is 3x faster.

Does anyone have any insight as to why this might be?

I am running master @ 11ed801d3df3c6e46b2f1f97dcfbf4ca3a2a2f4f
Host: AMD Thread Ripper 1950x

Guest launch parameters:

/usr/local/bin/qemu-system-x86_64 \
  -nographic \
  -runas geoff \
  -monitor stdio \
  -name guest=Aeryn,debug-threads=on \
  -machine q35,accel=kvm,usb=off,vmport=off,dump-guest-core=off \
  -cpu host,hv_time,hv_relaxed,hv_vapic,hv_vendor_id=lakeuv283713,kvm=off \
  -drive file=$DIR/ovmf/OVMF_CODE-pure-efi.fd,if=pflash,format=raw,unit=0,readonly=on \
  -drive file=$DIR/vars.fd,if=pflash,format=raw,unit=1 \
  -m 8192 \
  -mem-prealloc \
  -mem-path /dev/hugepages/aeryn \
  -realtime mlock=off \
  -smp 32,sockets=1,cores=16,threads=2 \
  -no-user-config \
  -nodefaults \
  -balloon none \
  \
  -global ICH9-LPC.disable_s3=1 \
  -global ICH9-LPC.disable_s4=1 \
  \
  -rtc base=localtime,driftfix=slew \
  -global kvm-pit.lost_tick_policy=discard \
  -no-hpet \
  \
  -boot strict=on \
  \
  -object iothread,id=iothread1 \
  -device virtio-scsi-pci,id=scsi1,iothread=iothread1 \
  -drive  if=none,id=hd0,file=/dev/moya/aeryn-efi,format=raw,aio=threads \
  -device scsi-hd,bus=scsi1.0,drive=hd0,bootindex=1 \
  -drive  if=none,id=hd1,file=/dev/moya/aeryn-rootfs,format=raw,aio=threads \
  -device scsi-hd,bus=scsi1.0,drive=hd1 \
  \
  -netdev tap,script=/home/geoff/VM/bin/ovs-ifup,downscript=/home/geoff/VM/bin/ovs-ifdown,ifname=aeryn.10,id=hostnet0 \
  -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:06:12:34,bus=pcie.0 \
  \
  -device intel-hda,id=sound0,bus=pcie.0 \
  -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
  \
  -device vfio-pci,host=0d:00.0,id=hostdev1,bus=pcie.0,addr=0x09,multifunction=on \
  -device vfio-pci,host=0d:00.1,id=hostdev2,bus=pcie.0,addr=0x09.1 \
  \
  -device ivshmem-plain,memdev=ivshmem \
  -object memory-backend-file,id=ivshmem,share=on,mem-path=/dev/shm/looking-glass,size=128M \
  \
  -msg timestamp=on \
  \
  -object input-linux,id=mou1,evdev=/dev/input/by-id/usb-Razer_Razer_DeathAdder_2013-event-mouse \
  -object input-linux,id=mou2,evdev=/dev/input/by-id/usb-Razer_Razer_DeathAdder_2013-if01-event-kbd \
  -object input-linux,id=mou3,evdev=/dev/input/by-id/usb-Razer_Razer_DeathAdder_2013-if02-event-kbd \
  -object input-linux,id=kbd1,evdev=/dev/input/by-id/usb-04d9_daskeyboard-event-kbd,grab_all=on,repeat=on \
  -object input-linux,id=kbd2,evdev=/dev/input/by-id/usb-04d9_daskeyboard-event-if01



Thanks in advance.



Re: [Qemu-devel] [PATCH v2 0/3] ivshmem: MSI bug fixes

2017-11-19 Thread geoff--- via Qemu-devel
I just updated to the latest build and applied this patch set; now on VM 
reset QEMU crashes with the following assert:


ivshmem.c:467: ivshmem_add_kvm_msi_virq: Assertion 
`!s->msi_vectors[vector].pdev' failed.


On 2017-11-15 18:31, Ladi Prosek wrote:

Fixes bugs in the ivshmem device implementation uncovered with the new
Windows ivshmem driver:
https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem

v1->v2:
* Patch 1 - added reproducer info to commit message (Markus)
* Patch 2 - restructured conditionals, fixed comment formatting 
(Markus)

* Patch 3 - added reproducer info to commit message (Markus)

Ladi Prosek (3):
  ivshmem: Don't update non-existent MSI routes
  ivshmem: Always remove irqfd notifiers
  ivshmem: Improve MSI irqfd error handling

 hw/misc/ivshmem.c | 77 +--
 1 file changed, 58 insertions(+), 19 deletions(-)





Re: [Qemu-devel] [PATCH 3/3] ivshmem: Improve MSI irqfd error handling

2017-11-13 Thread geoff--- via Qemu-devel

On 2017-11-14 04:27, Markus Armbruster wrote:

Ladi Prosek  writes:


Adds a rollback path to ivshmem_enable_irqfd() and fixes
ivshmem_disable_irqfd() to bail if irqfd has not been enabled.

Signed-off-by: Ladi Prosek 


Is this a theoretical bug, or can you trigger it?


It is reproducible; I can trigger it by simply unloading the Windows 
driver and then attempting to re-load it.


-Geoff



Re: [Qemu-devel] [PATCH 1/3] ivshmem: Don't update non-existent MSI routes

2017-11-10 Thread geoff--- via Qemu-devel
Thanks Ladi, I had not yet had time to dig into these; this patch set 
resolves all issues I was aware of.


Tested-by: Geoffrey McRae 

On 2017-11-11 04:34, Ladi Prosek wrote:
As of commit 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications"),
QEMU crashes with:

  kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

if the ivshmem device is configured with more vectors than what the server
supports. This is caused by ivshmem_vector_unmask() being called on
vectors that have not been initialized by ivshmem_add_kvm_msi_virq().

This commit fixes it by adding a simple check to the mask and unmask
callbacks.

Note that the opposite mismatch, if the server supplies more vectors than
what the device is configured for, is already handled and leads to output
like:

  Too many eventfd received, device has 1 vectors

Fixes: 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications")
Signed-off-by: Ladi Prosek 
---
 hw/misc/ivshmem.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index a5a46827fe..6e46669744 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -317,6 +317,10 @@ static int ivshmem_vector_unmask(PCIDevice *dev, unsigned vector,
     int ret;
 
     IVSHMEM_DPRINTF("vector unmask %p %d\n", dev, vector);
+    if (!v->pdev) {
+        error_report("ivshmem: vector %d route does not exist", vector);
+        return -EINVAL;
+    }
 
     ret = kvm_irqchip_update_msi_route(kvm_state, v->virq, msg, dev);
     if (ret < 0) {
@@ -331,12 +335,16 @@ static void ivshmem_vector_mask(PCIDevice *dev, unsigned vector)
 {
     IVShmemState *s = IVSHMEM_COMMON(dev);
     EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
+    MSIVector *v = &s->msi_vectors[vector];
     int ret;
 
     IVSHMEM_DPRINTF("vector mask %p %d\n", dev, vector);
+    if (!v->pdev) {
+        error_report("ivshmem: vector %d route does not exist", vector);
+        return;
+    }
 
-    ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n,
-                                                s->msi_vectors[vector].virq);
+    ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n, v->virq);
     if (ret != 0) {
         error_report("remove_irqfd_notifier_gsi failed");
     }





[Qemu-devel] PCI Passthrough + AMD + NPT

2017-10-22 Thread geoff--- via Qemu-devel

Hi All,

I have started to dig into why NPT seems to slow down graphics 
performance on
AMD systems using PCI passthrough and figured I would report what I have 
so far
discovered. I have noted that the primary point of failure seems to be 
specifically
PhysX. This is why people only see a slowdown in certain games; not
everything uses PhysX.

Using FluidMark[1] the problem is immediately obvious, showing extremely 
low
FPS on light/medium workloads with NPT enabled, and extreme fluidity and 
high
FPS with NPT disabled.

Switching nVidia to use the CPU makes no difference to the performance when 
NPT
is enabled, which seems to indicate that PhysX is falling back to the CPU 
due to
a failure of some kind to initialize.

With NPT turned off and nVidia set to use the CPU for PhysX, I see an
identical performance drop-off in FluidMark as I see when NPT is 
enabled, which
would seem to confirm this suspicion.

Since other features such as the APIC are only available if NPT is enabled, 
it
could be something downstream of NPT that is getting disabled as a
consequence of turning off NPT. It might be interesting to see if we can 
get
some diagnostic information out of PhysX to see what, if any, error or 
debugging
information it might provide when it falls back to the CPU.

1: 
http://www.geeks3d.com/20130308/fluidmark-1-5-1-physx-benchmark-fluid-sph-simulation-opengl-download/


Kind Regards,
Geoffrey McRae



Re: [Qemu-devel] ivshmem Windows Driver

2017-10-19 Thread geoff--- via Qemu-devel

On 2017-10-19 20:51, Ladi Prosek wrote:

On Thu, Oct 19, 2017 at 11:41 AM,   wrote:

On 2017-10-19 20:07, ge...@hostfission.com wrote:


On 2017-10-19 20:01, Ladi Prosek wrote:


On Thu, Oct 19, 2017 at 10:44 AM,   wrote:


On 2017-10-19 19:35, Ladi Prosek wrote:



On Wed, Oct 18, 2017 at 5:04 PM,   wrote:



Hi Ladi & Yan,

I am pleased to present the completed driver for review, please 
see:


https://github.com/gnif/kvm-guest-drivers-windows




Awesome!

Feel free to open pull request, it should be easier to comment on.




Great, I will do so after I have addressed the below. Thanks again.



I have created a PR, see:

https://github.com/virtio-win/kvm-guest-drivers-windows/pull/174



* WoW considerations: It would be nice if the driver could detect 
that
the map request is coming from a 32-bit process and expect a 
different

layout of struct IVSHMEM_MMAP.




I did think of this but I am unsure as to how to detect this.



I don't think I ever used it but IoIs32bitProcess() looks promising.




Obviously PVOID will be 32bit which will mess with the struct size and
offset of vectors but I am not aware of a solution to this. If you 
have
any suggestions on how to rectify this it would be very much 
appreciated.


I was thinking something simple like:

#ifdef _WIN64
typedef struct IVSHMEM_MMAP_32
{
...
UINT32 ptr;
...
}
IVSHMEM_MMAP_32, *PIVSHMEM_MMAP_32;
#endif

in a private header. Then in the IOCTL handler call IoIs32bitProcess()
and if it returns true, expect IVSHMEM_MMAP_32 instead of
IVSHMEM_MMAP.


Ah that makes sense, thanks! This has been done.
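
Roughly, the handler change takes this shape (a sketch only; the surrounding
request handling and variable names are illustrative rather than the final
driver code):

#ifdef _WIN64
    /* Sketch: a 32-bit caller passes the packed 32-bit layout. */
    if (IoIs32bitProcess(WdfRequestWdmGetIrp(Request))) {
        PIVSHMEM_MMAP_32 in32 = NULL;
        status = WdfRequestRetrieveInputBuffer(Request,
            sizeof(IVSHMEM_MMAP_32), (PVOID *)&in32, NULL);
        /* ...copy/translate in32 into the 64-bit IVSHMEM_MMAP before use... */
    }
#endif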





* It would be cleaner to use READ_REGISTER_* and WRITE_REGISTER_*
from/to IVSHMEMDeviceRegisters instead of plain memory accesses. 
Or at

the very least the accesses should be marked volatile.




I thought that mapping the IO space was enough for this since it is
mapped as non-cacheable. I can see the point of marking it volatile 
but
see no need to use the read/write register semantics. If this is 
what it

takes however I am happy to do so.



Code like this raises eyebrows:

  deviceContext->devRegisters->doorbell |= (UINT32)in->vector |
(in->peerID << 16);

Many readers will probably be wondering what exactly the compiler is
allowed to do with this statement. May it end up ORing the lower and
upper word separately, for example?

  OR [word ptr addr], in->vector
  OR [word ptr addr + 2], in->peerID

And, by the way, is OR really what we want here?



After double checking this you are dead right, the register is 
documented

as write only. I will fix this.



Done.
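
For reference, a write through the WDK register helper rather than the
cached-pointer OR looks something like this (a sketch; the struct and field
names follow the discussion above rather than the final source):

/* Write the doorbell once via the register helper: peer ID in the high
 * word, vector in the low word, no read-modify-write of a write-only reg. */
WRITE_REGISTER_ULONG(
    (volatile ULONG *)&deviceContext->devRegisters->doorbell,
    ((ULONG)in->peerID << 16) | (ULONG)in->vector);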






* In ivshmem.inf: ManufacturerName="Red Hat, Inc." instead of 
"RedHat"




No worries.

* Is any of the API used by the driver Win10-only? Just curious, 
it's

fine to build the driver only for Win10 for now even if it isn't.




I have not tried to build it on anything older than Win 10 build 10586
as I have nothing older, but AFAIK it should build on Windows 8.1 or
later just fine. This is more due to my lack of familiarity with Visual
Studio; give me gcc and vim any day :).



Gotcha, no worries, other versions can be tested later.








Re: [Qemu-devel] ivshmem Windows Driver

2017-10-19 Thread geoff--- via Qemu-devel

On 2017-10-19 20:07, ge...@hostfission.com wrote:

On 2017-10-19 20:01, Ladi Prosek wrote:

On Thu, Oct 19, 2017 at 10:44 AM,   wrote:

On 2017-10-19 19:35, Ladi Prosek wrote:


On Wed, Oct 18, 2017 at 5:04 PM,   wrote:


Hi Ladi & Yan,

I am pleased to present the completed driver for review, please 
see:


https://github.com/gnif/kvm-guest-drivers-windows



Awesome!

Feel free to open pull request, it should be easier to comment on.



Great, I will do so after I have addressed the below. Thanks again.



I have created a PR, see:

https://github.com/virtio-win/kvm-guest-drivers-windows/pull/174



* WoW considerations: It would be nice if the driver could detect 
that
the map request is coming from a 32-bit process and expect a 
different

layout of struct IVSHMEM_MMAP.



I did think of this but I am unsure as to how to detect this.


I don't think I ever used it but IoIs32bitProcess() looks promising.




Obviously PVOID will be 32bit which will mess with the struct size and
offset of vectors but I am not aware of a solution to this. If you have
any suggestions on how to rectify this it would be very much 
appreciated.




* It would be cleaner to use READ_REGISTER_* and WRITE_REGISTER_*
from/to IVSHMEMDeviceRegisters instead of plain memory accesses. Or 
at

the very least the accesses should be marked volatile.



I thought that mapping the IO space was enough for this since it is
mapped as non-cacheable. I can see the point of marking it volatile 
but
see no need to use the read/write register semantics. If this is what 
it

takes however I am happy to do so.


Code like this raises eyebrows:

  deviceContext->devRegisters->doorbell |= (UINT32)in->vector |
(in->peerID << 16);

Many readers will probably be wondering what exactly the compiler is
allowed to do with this statement. May it end up ORing the lower and
upper word separately, for example?

  OR [word ptr addr], in->vector
  OR [word ptr addr + 2], in->peerID

And, by the way, is OR really what we want here?



After double checking this you are dead right, the register is 
documented

as write only. I will fix this.


Done.





* In ivshmem.inf: ManufacturerName="Red Hat, Inc." instead of 
"RedHat"




No worries.

* Is any of the API used by the driver Win10-only? Just curious, 
it's

fine to build the driver only for Win10 for now even if it isn't.



I have not tried to build it on anything older than Win 10 build 10586
as I have nothing older, but AFAIK it should build on Windows 8.1 or
later just fine. This is more due to my lack of familiarity with Visual
Studio; give me gcc and vim any day :).


Gotcha, no worries, other versions can be tested later.





Re: [Qemu-devel] ivshmem Windows Driver

2017-10-19 Thread geoff--- via Qemu-devel

On 2017-10-19 20:01, Ladi Prosek wrote:

On Thu, Oct 19, 2017 at 10:44 AM,   wrote:

On 2017-10-19 19:35, Ladi Prosek wrote:


On Wed, Oct 18, 2017 at 5:04 PM,   wrote:


Hi Ladi & Yan,

I am pleased to present the completed driver for review, please see:

https://github.com/gnif/kvm-guest-drivers-windows



Awesome!

Feel free to open pull request, it should be easier to comment on.



Great, I will do so after I have addressed the below. Thanks again.



* WoW considerations: It would be nice if the driver could detect 
that
the map request is coming from a 32-bit process and expect a 
different

layout of struct IVSHMEM_MMAP.



I did think of this but I am unsure as to how to detect this.


I don't think I ever used it but IoIs32bitProcess() looks promising.



* It would be cleaner to use READ_REGISTER_* and WRITE_REGISTER_*
from/to IVSHMEMDeviceRegisters instead of plain memory accesses. Or 
at

the very least the accesses should be marked volatile.



I thought that mapping the IO space was enough for this since it is
mapped as non-cacheable. I can see the point of marking it volatile 
but
see no need to use the read/write register semantics. If this is what 
it

takes however I am happy to do so.


Code like this raises eyebrows:

  deviceContext->devRegisters->doorbell |= (UINT32)in->vector |
(in->peerID << 16);

Many readers will probably be wondering what exactly the compiler is
allowed to do with this statement. May it end up ORing the lower and
upper word separately, for example?

  OR [word ptr addr], in->vector
  OR [word ptr addr + 2], in->peerID

And, by the way, is OR really what we want here?



After double checking this you are dead right, the register is 
documented

as write only. I will fix this.



* In ivshmem.inf: ManufacturerName="Red Hat, Inc." instead of 
"RedHat"




No worries.


* Is any of the API used by the driver Win10-only? Just curious, it's
fine to build the driver only for Win10 for now even if it isn't.



I have not tried to build it on anything older than Win 10 build 10586
as I have nothing older, but AFAIK it should build on Windows 8.1 or
later just fine. This is more due to my lack of familiarity with Visual
Studio; give me gcc and vim any day :).


Gotcha, no worries, other versions can be tested later.





Re: [Qemu-devel] ivshmem Windows Driver

2017-10-19 Thread geoff--- via Qemu-devel

On 2017-10-19 19:35, Ladi Prosek wrote:

On Wed, Oct 18, 2017 at 5:04 PM,   wrote:

Hi Ladi & Yan,

I am pleased to present the completed driver for review, please see:

https://github.com/gnif/kvm-guest-drivers-windows


Awesome!

Feel free to open pull request, it should be easier to comment on.


Great, I will do so after I have addressed the below. Thanks again.



* WoW considerations: It would be nice if the driver could detect that
the map request is coming from a 32-bit process and expect a different
layout of struct IVSHMEM_MMAP.


I did think of this but I am unsure as to how to detect this.



* It would be cleaner to use READ_REGISTER_* and WRITE_REGISTER_*
from/to IVSHMEMDeviceRegisters instead of plain memory accesses. Or at
the very least the accesses should be marked volatile.


I thought that mapping the IO space was enough for this since it is
mapped as non-cacheable. I can see the point of marking it volatile but
see no need to use the read/write register semantics. If this is what it
takes however I am happy to do so.



* In ivshmem.inf: ManufacturerName="Red Hat, Inc." instead of "RedHat"



No worries.


* Is any of the API used by the driver Win10-only? Just curious, it's
fine to build the driver only for Win10 for now even if it isn't.


I have not tried to build it on anything older than Win 10 build 10586
as I have nothing older, but AFAIK it should build on Windows 8.1 or
later just fine. This is more due to my lack of familiarity with Visual
Studio; give me gcc and vim any day :).



Thanks!


All issues previously mentioned have been addressed and all missing
functionality has been added.

Please note that this work has exposed a bug in the QEMU ivshmem
virtual device itself: it seems that if MSI interrupts are enabled
and the driver is unloaded twice, an assertion is thrown due to what
looks to be a double free, crashing QEMU.

Once this driver has been finalized I will look into the cause of
this problem and see if I can correct it also.

Kind Regards,
Geoffrey McRae





Re: [Qemu-devel] ivshmem Windows Driver

2017-10-18 Thread geoff--- via Qemu-devel

Hi Ladi & Yan,

I am pleased to present the completed driver for review, please see:

https://github.com/gnif/kvm-guest-drivers-windows

All issues previously mentioned have been addressed and all missing
functionality has been added.

Please note that this work has exposed a bug in the QEMU ivshmem
virtual device itself: it seems that if MSI interrupts are enabled
and the driver is unloaded twice, an assertion is thrown due to what
looks to be a double free, crashing QEMU.

Once this driver has been finalized I will look into the cause of
this problem and see if I can correct it also.

Kind Regards,
Geoffrey McRae



Re: [Qemu-devel] ivshmem Windows Driver

2017-10-17 Thread geoff--- via Qemu-devel

On 2017-10-18 17:50, Ladi Prosek wrote:

On Wed, Oct 18, 2017 at 7:50 AM,   wrote:

On 2017-10-18 16:31, Ladi Prosek wrote:


Hi Geoff,

On Mon, Oct 16, 2017 at 8:31 PM,   wrote:


Hi Yan & Ladi.

I have written an initial implementation that supports just the shared
memory mapping at this time. I plan to add events also but before I go
further I would like some feedback if possible on what I have implemented
thus far.

Please see:


https://github.com/gnif/kvm-guest-drivers-windows/commit/8655cf12fbdd77b991f96d97bc20f967b5907c12



Thank you, looks good overall.

* Please don't use the 'vio' prefix for this driver. ivshmem is not a
VirtIO device (i.e. not using the VirtIO protocol). Also the test
program should live in a subdirectory, so maybe something like
/ivshmem and /ivshmem/test.



Noted, I will remove the prefix throughout and move the test 
application.




* In VIOIVSHMEMEvtDevicePrepareHardware: I don't think that Windows
guarantees that resources are enumerated in BAR order. In VirtIO
drivers we read the PCI config space to identify the BAR index:

https://github.com/virtio-win/kvm-guest-drivers-windows/blob/master/VirtIO/VirtIOPCICommon.c#L353



The windows 'toaster' sample relies on the resource order, but as a
belt and braces approach I will update the code to use the same
approach.


Interesting, thanks! If that's really the case then we can remove the
code from VirtioLib. I have cloned the latest Windows-driver-samples
but can't find this under general/toaster. Namely
ToasterEvtDevicePrepareHardware just prints some info about all
resources but does not do anything order-related. Can you point me to
the right code?



Sorry, my mistake; it wasn't the toaster code but the kmdf pcidrv sample,
which assumes the BAR ordering to determine which is which.

https://github.com/Microsoft/Windows-driver-samples/blob/aa6e0b36eb932099fa4eb950a6f5e289a23b6d6e/general/pcidrv/kmdf/HW/nic_init.c#L649



* IOCTL codes on Windows have a structure to them:

https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/defining-i-o-control-codes



Thanks, I will fix this.



* In VIOIVSHMEMEvtIoDeviceControl: The "only one mapping at a time is
allowed" test has a race. I think that simply making the IO queue
WdfIoQueueDispatchSequential instead of WdfIoQueueDispatchParallel
will fix it.



Good point, I will change this.



* According to MSDN, MmMapLockedPagesSpecifyCache(UserMode) should be
wrapped in try/except. Also, what happens if the file handle is
inherited by a child process? Can it unmap the mapping in parent's
address space? What if the parent exits? A possible solution is
discussed in this article:
http://www.osronline.com/article.cfm?article=39



Noted re try/except. As for a child inheriting it, the owner is tracked
by the WDFFILEOBJECT, which the child I believe will inherit also, which
would mean that the child would gain the ability to issue IOCTLs to the
mapping.



Thanks!
Ladi




No, thank you! I am grateful someone is willing to provide some 
feedback

on this.

I have been working on adding MSI interrupt support to the driver
also which is close to ready, just trying to figure out why the driver
fails to start with STATUS_DEVICE_POWER_FAILURE when I try to setup 
the

IRQs with WdfInterruptCreate.

Thanks again,
Geoff





Re: [Qemu-devel] ivshmem Windows Driver

2017-10-17 Thread geoff--- via Qemu-devel

On 2017-10-18 16:31, Ladi Prosek wrote:

Hi Geoff,

On Mon, Oct 16, 2017 at 8:31 PM,   wrote:

Hi Yan & Ladi.

I have written an initial implementation that supports just the shared
memory
mapping at this time. I plan to add events also but before I go 
further I

would
like some feedback if possible on what I have implemented thus far.

Please see:

https://github.com/gnif/kvm-guest-drivers-windows/commit/8655cf12fbdd77b991f96d97bc20f967b5907c12


Thank you, looks good overall.

* Please don't use the 'vio' prefix for this driver. ivshmem is not a
VirtIO device (i.e. not using the VirtIO protocol). Also the test
program should live in a subdirectory, so maybe something like
/ivshmem and /ivshmem/test.


Noted, I will remove the prefix throughout and move the test 
application.




* In VIOIVSHMEMEvtDevicePrepareHardware: I don't think that Windows
guarantees that resources are enumerated in BAR order. In VirtIO
drivers we read the PCI config space to identify the BAR index:
https://github.com/virtio-win/kvm-guest-drivers-windows/blob/master/VirtIO/VirtIOPCICommon.c#L353


The windows 'toaster' sample relies on the resource order, but as a
belt and braces approach I will update the code to use the same
approach.
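
For reference, the config space can be read from a KMDF driver through the
standard PCI bus interface; a minimal sketch (illustrative only, not the
final driver code):

BUS_INTERFACE_STANDARD busIf;
PCI_COMMON_HEADER hdr;
NTSTATUS status = WdfFdoQueryForInterface(Device,
    &GUID_BUS_INTERFACE_STANDARD, (PINTERFACE)&busIf,
    sizeof(busIf), 1, NULL);
if (NT_SUCCESS(status)) {
    /* Read the type-0 header; hdr.u.type0.BaseAddresses[] then tells us
     * which BAR is which instead of trusting resource enumeration order. */
    busIf.GetBusData(busIf.Context, PCI_WHICH_SPACE_CONFIG,
                     &hdr, 0, sizeof(hdr));
    busIf.InterfaceDereference(busIf.Context);
}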



* IOCTL codes on Windows have a structure to them:
https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/defining-i-o-control-codes


Thanks, I will fix this.
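
For example, a code structured per those docs would be defined along these
lines (the device type, function number and access shown are placeholder
values only, not the driver's final ones):

#define IOCTL_IVSHMEM_EXAMPLE \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)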



* In VIOIVSHMEMEvtIoDeviceControl: The "only one mapping at a time is
allowed" test has a race. I think that simply making the IO queue
WdfIoQueueDispatchSequential instead of WdfIoQueueDispatchParallel
will fix it.


Good point, I will change this.



* According to MSDN, MmMapLockedPagesSpecifyCache(UserMode) should be
wrapped in try/except. Also, what happens if the file handle is
inherited by a child process? Can it unmap the mapping in parent's
address space? What if the parent exits? A possible solution is
discussed in this article:
http://www.osronline.com/article.cfm?article=39


Noted re try/except. As for a child inheriting it, the owner is tracked
by the WDFFILEOBJECT, which the child I believe will inherit also, which
would mean that the child would gain the ability to issue IOCTLs to the
mapping.
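
Regarding the try/except, the mapping call ends up looking something like
this (a sketch; the mdl variable and the chosen error status are assumptions,
not the actual driver code):

PVOID userVa = NULL;
__try {
    /* User-mode mappings may raise an exception, hence the SEH wrapper. */
    userVa = MmMapLockedPagesSpecifyCache(mdl, UserMode, MmNonCached,
                                          NULL, FALSE, NormalPagePriority);
}
__except (EXCEPTION_EXECUTE_HANDLER) {
    userVa = NULL;
}
if (!userVa) {
    return STATUS_INSUFFICIENT_RESOURCES;
}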



Thanks!
Ladi



No, thank you! I am grateful someone is willing to provide some feedback
on this.

I have been working on adding MSI interrupt support to the driver
also which is close to ready, just trying to figure out why the driver
fails to start with STATUS_DEVICE_POWER_FAILURE when I try to setup the
IRQs with WdfInterruptCreate.

Thanks again,
Geoff



Re: [Qemu-devel] ivshmem Windows Driver

2017-10-16 Thread geoff--- via Qemu-devel

Hi Yan & Ladi.

I have written an initial implementation that supports just the shared memory
mapping at this time. I plan to add events also but before I go further I would
like some feedback if possible on what I have implemented thus far.

Please see:

https://github.com/gnif/kvm-guest-drivers-windows/commit/8655cf12fbdd77b991f96d97bc20f967b5907c12

Kind Regards,
Geoff

On 2017-10-15 23:29, ge...@hostfission.com wrote:

On 2017-10-15 23:24, Yan Vugenfirer wrote:

On 15 Oct 2017, at 15:21, ge...@hostfission.com wrote:

Hi Yan,

Thank you for the information. I am rather new to Windows Driver 
development and learning as I go, so this may take some time, but 
since the driver only needs to perform very basic functions I do not 
see this as being too much of a challenge.


I think you can look into Windows virtio-balloon implementation as an
example of simple driver:
https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/Balloon

It relies on virtio library
(https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/VirtIO)
and it is WDF driver (MS framework that simplifies the drivers
development) that makes it very simple.



Thanks again, I already have a prototype driver working using WDF,
it's more learning the windows internals and how to best implement
things.



-Geoff

On 2017-10-15 22:14, Yan Vugenfirer wrote:

He Geoff,
The official virtio-win drivers upstream repository is here:
https://github.com/virtio-win/kvm-guest-drivers-windows
1. There is no ivshmem Windows Driver for now as far as I know
2. We are signing the drivers for community usage
https://fedoraproject.org/wiki/Windows_Virtio_Drivers from the same
repository.
The process will be: submit the code for review with pull request
(better use existing virtio library for virtio communication between
the guest and the host), pass internal tests and at the least being
able to pass MS HCK\HLK tests, later on the driver will be pulled 
into

official build and release with rest of the drivers for community
usage.
3. We are happy to cooperate on adding new functionality to current
package of virtio drivers for Windows
4. As already mentioned: 
https://github.com/virtio-win/kvm-guest-drivers-windows

Thanks a lot!
If you have more questions, please don’t hesitate to talk to me, 
Ladi

or anyone else from Red Hat involved with virtio-win development.
Best regards,
Yan.
On 15 Oct 2017, at 12:32, geoff--- via Qemu-devel 
 wrote:

Hi All,
I am writing some code that needs to share a block of ram between a 
Windows guest and Linux host. For this I am using the ivshmem 
device and I have written a very primitive driver for windows that 
allows a single application to request to memory map the pci bar 
(shared memory) into the program's context using DeviceIoControl.
This is all working fine, but the next problem is that I need the driver 
to be signed. In its current state I would not even suggest it be 
signed as it was just hacked together to test my concept, but now that I 
know it's viable I am willing to invest whatever time is 
required to write a driver that would be acceptable for signing.
The ideal driver would be general purpose and could be leveraged 
for any user-mode application, not just my specific case. It 
would need to implement the IRQ/event features of ivshmem and 
possibly even some kind of security to prevent unauthorized use by 
rogue applications (a shared secret configured on the chardev?).

I have several questions:
1) Has someone done this? I can't find any reference to a Windows 
driver for this device anywhere.
2) If I were to pursue writing this driver, what would be the best 
way to go about it so as to ensure that it is in a state where it 
could be signed with the RedHat vendor key?

3) What is the likelihood of having such a driver signed?
4) Is there a preferred git host for such a driver?
Kind Regards
-Geoff







[Qemu-devel] ivshmem Windows Driver

2017-10-15 Thread geoff--- via Qemu-devel

Hi All,

I am writing some code that needs to share a block of ram between a 
Windows guest and Linux host. For this I am using the ivshmem device and 
I have written a very primitive driver for windows that allows a single 
application to request to memory map the pci bar (shared memory) into 
the program's context using DeviceIoControl.
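
For context, the user-mode side of this is little more than the following; 
the device name, IOCTL code and structure here are placeholders for 
illustration, not the actual interface:

#include <windows.h>

/* Placeholder mapping structure and IOCTL code, for illustration only. */
typedef struct { void *ptr; UINT64 size; } IVSHMEM_MAP;
#define IOCTL_IVSHMEM_MAP \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)

int main(void)
{
    /* Open the (hypothetically named) device and ask the driver to map
     * the shared-memory BAR into this process. */
    HANDLE dev = CreateFileA("\\\\.\\ivshmem", GENERIC_READ | GENERIC_WRITE,
                             0, NULL, OPEN_EXISTING, 0, NULL);
    IVSHMEM_MAP map = {0};
    DWORD returned = 0;
    DeviceIoControl(dev, IOCTL_IVSHMEM_MAP, NULL, 0,
                    &map, sizeof(map), &returned, NULL);
    /* map.ptr now points at the shared memory within this process. */
    CloseHandle(dev);
    return 0;
}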


This is all working fine, but the next problem is that I need the driver to 
be signed. In its current state I would not even suggest it be signed 
as it was just hacked together to test my concept, but now that I know it's 
viable I am willing to invest whatever time is required to write a 
driver that would be acceptable for signing.


The ideal driver would be general purpose and could be leveraged for any 
user-mode application, not just my specific case. It would need to 
implement the IRQ/event features of ivshmem and possibly even some kind 
of security to prevent unauthorized use by rogue applications (a shared 
secret configured on the chardev?).


I have several questions:

  1) Has someone done this? I can't find any reference to a Windows 
driver for this device anywhere.
  2) If I were to pursue writing this driver, what would be the best way 
to go about it so as to ensure that it is in a state where it could be 
signed with the RedHat vendor key?

  3) What is the likelihood of having such a driver signed?
  4) Is there a preferred git host for such a driver?

Kind Regards
-Geoff



[Qemu-devel] [PATCH] [pckbd] Prevent IRQs when the guest disables the mouse

2017-10-12 Thread geoff--- via Qemu-devel
When the guest OS needs to send mouse commands it will, at least in the case
of Windows 10, set the KBD_MODE_DISABLE_MOUSE bit to prevent interrupts from
causing stream desynchronisation.

Here is Windows 10 attempting to issue a PS/2 mouse reset without this fix, 
where you can see that the mouse positional data was returned as the answer 
to the get-type command.

KBD: kbd: write cmd=0xd4  // write next cmd to the aux port
KBD: kbd: read status=0x1c
KBD: kbd: read status=0x1c
KBD: kbd: read status=0x1c
KBD: kbd: write data=0xff
kbd: write mouse 0xff // reset
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0xfa  // ack
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0xaa  // self-test good
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x00  // the device type
KBD: kbd: read status=0x3d
KBD: kbd: write cmd=0xd4  // write cmd to the aux port
KBD: kbd: read status=0x3d
KBD: kbd: write data=0xf2
kbd: write mouse 0xf2 // get type
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x08  // mouse data byte 1
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x00  // mouse data byte 2
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x00  // mouse data byte 3
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0xfa  // the ack for the get type above
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x00  // the device type
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x08  // mouse data byte 1
KBD: kbd: read status=0x3d
KBD: kbd: read status=0x3d
KBD: kbd: read data=0x00  // mouse data byte 2

Signed-off-by: Geoffrey McRae 
---
 hw/input/pckbd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
index c479f827b6..78d5356817 100644
--- a/hw/input/pckbd.c
+++ b/hw/input/pckbd.c
@@ -168,7 +168,8 @@ static void kbd_update_irq(KBDState *s)
 if (s->pending == KBD_PENDING_AUX) {
 s->status |= KBD_STAT_MOUSE_OBF;
 s->outport |= KBD_OUT_MOUSE_OBF;
-if (s->mode & KBD_MODE_MOUSE_INT)
+if ((s->mode & KBD_MODE_MOUSE_INT) &&
+!(s->mode & KBD_MODE_DISABLE_MOUSE))
 irq_mouse_level = 1;
 } else {
 if ((s->mode & KBD_MODE_KBD_INT) &&
--
2.11.0