Re: kvm PCI assignment & VFIO ramblings

2011-08-30 Thread Joerg Roedel
On Sun, Aug 28, 2011 at 05:04:32PM +0300, Avi Kivity wrote: > On 08/28/2011 04:56 PM, Joerg Roedel wrote: >> This can't be secured by a lock, because it introduces potential >> A->B<-->B->A lock problem when two processes try to take each others mm. >> It could probably be solved by a task->real_m

Re: kvm PCI assignment & VFIO ramblings

2011-08-30 Thread Joerg Roedel
On Fri, Aug 26, 2011 at 12:04:22PM -0600, Alex Williamson wrote: > On Thu, 2011-08-25 at 20:05 +0200, Joerg Roedel wrote: > > If we really expect segment numbers that need the full 16 bit then this > > would be the way to go. Otherwise I would prefer returning the group-id > > directly and partiti

Re: kvm PCI assignment & VFIO ramblings

2011-08-29 Thread David Gibson
On Fri, Aug 26, 2011 at 01:17:05PM -0700, Aaron Fabbri wrote: [snip] > Yes. In essence, I'd rather not have to run any other admin processes. > Doing things programmatically, on the fly, from each process, is the > cleanest model right now. The "persistent group" model doesn't necessarily preven

Re: kvm PCI assignment & VFIO ramblings

2011-08-28 Thread Avi Kivity
On 08/28/2011 04:56 PM, Joerg Roedel wrote: On Sun, Aug 28, 2011 at 04:14:00PM +0300, Avi Kivity wrote: > On 08/26/2011 12:24 PM, Roedel, Joerg wrote: >> The biggest problem with this approach is that it has to happen in the >> context of the given process. Linux can't really modify an mm whi

Re: kvm PCI assignment & VFIO ramblings

2011-08-28 Thread Joerg Roedel
On Sun, Aug 28, 2011 at 04:14:00PM +0300, Avi Kivity wrote: > On 08/26/2011 12:24 PM, Roedel, Joerg wrote: >> The biggest problem with this approach is that it has to happen in the >> context of the given process. Linux can't really modify an mm which >> which belong to another context in a safe w

Re: kvm PCI assignment & VFIO ramblings

2011-08-28 Thread Avi Kivity
On 08/26/2011 12:24 PM, Roedel, Joerg wrote: > > As I see it there are two options: (a) make subsequent accesses from > userspace or the guest result in either a SIGBUS that userspace must > either deal with or die, or (b) replace the mapping with a dummy RO > mapping containing 0xff, with an

Re: kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Chris Wright
* Aaron Fabbri (aafab...@cisco.com) wrote: > On 8/26/11 12:35 PM, "Chris Wright" wrote: > > * Aaron Fabbri (aafab...@cisco.com) wrote: > >> Each process will open vfio devices on the fly, and they need to be able to > >> share IOMMU resources. > > > > How do you share IOMMU resources w/ multiple

Re: kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Aaron Fabbri
On 8/26/11 12:35 PM, "Chris Wright" wrote: > * Aaron Fabbri (aafab...@cisco.com) wrote: >> On 8/26/11 7:07 AM, "Alexander Graf" wrote: >>> Forget the KVM case for a moment and think of a user space device driver. I >>> as >>> a user am not root. But I as a user when having access to /dev/vfio

Re: kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Chris Wright
* Aaron Fabbri (aafab...@cisco.com) wrote: > On 8/26/11 7:07 AM, "Alexander Graf" wrote: > > Forget the KVM case for a moment and think of a user space device driver. I > > as > > a user am not root. But I as a user when having access to /dev/vfioX want to > > be able to access the device and man

Re: kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Alex Williamson
On Thu, 2011-08-25 at 20:05 +0200, Joerg Roedel wrote: > On Thu, Aug 25, 2011 at 11:20:30AM -0600, Alex Williamson wrote: > > On Thu, 2011-08-25 at 12:54 +0200, Roedel, Joerg wrote: > > > > We need to solve this differently. ARM is starting to use the iommu-api > > > too and this definitly does no

Re: kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Aaron Fabbri
On 8/26/11 7:07 AM, "Alexander Graf" wrote: > > > Forget the KVM case for a moment and think of a user space device driver. I as > a user am not root. But I as a user when having access to /dev/vfioX want to > be able to access the device and manage it - and only it. The admin of that > box

Re: kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Alexander Graf
On 26.08.2011, at 10:24, Joerg Roedel wrote: > On Fri, Aug 26, 2011 at 09:07:35AM -0500, Alexander Graf wrote: >> On 26.08.2011, at 04:33, Roedel, Joerg wrote: >>> >>> The reason is that you mean the usability for the programmer and I mean >>> it for the actual user of qemu :) >> >> No, we mean

Re: kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Joerg Roedel
On Fri, Aug 26, 2011 at 09:07:35AM -0500, Alexander Graf wrote: > On 26.08.2011, at 04:33, Roedel, Joerg wrote: > > > > The reason is that you mean the usability for the programmer and I mean > > it for the actual user of qemu :) > > No, we mean the actual user of qemu. The reason being that maki

Re: kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Alexander Graf
On 26.08.2011, at 04:33, Roedel, Joerg wrote: > On Fri, Aug 26, 2011 at 12:20:00AM -0400, David Gibson wrote: >> On Wed, Aug 24, 2011 at 01:03:32PM +0200, Roedel, Joerg wrote: >>> On Wed, Aug 24, 2011 at 05:33:00AM -0400, David Gibson wrote: On Wed, Aug 24, 2011 at 11:14:26AM +0200, Roedel,

Re: kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Roedel, Joerg
On Fri, Aug 26, 2011 at 12:20:00AM -0400, David Gibson wrote: > On Wed, Aug 24, 2011 at 01:03:32PM +0200, Roedel, Joerg wrote: > > On Wed, Aug 24, 2011 at 05:33:00AM -0400, David Gibson wrote: > > > On Wed, Aug 24, 2011 at 11:14:26AM +0200, Roedel, Joerg wrote: > > > > > > I don't see a reason to

Re: kvm PCI assignment & VFIO ramblings

2011-08-26 Thread Roedel, Joerg
On Fri, Aug 26, 2011 at 12:24:23AM -0400, David Gibson wrote: > On Thu, Aug 25, 2011 at 08:25:45AM -0500, Alexander Graf wrote: > > On 25.08.2011, at 07:31, Roedel, Joerg wrote: > > > For mmio we could stop the guest and replace the mmio region with a > > > region that is filled with 0xff, no? > >

Re: kvm PCI assignment & VFIO ramblings

2011-08-25 Thread David Gibson
On Thu, Aug 25, 2011 at 08:25:45AM -0500, Alexander Graf wrote: > > On 25.08.2011, at 07:31, Roedel, Joerg wrote: > > > On Wed, Aug 24, 2011 at 11:07:46AM -0400, Alex Williamson wrote: > >> On Wed, 2011-08-24 at 10:52 +0200, Roedel, Joerg wrote: > > > > [...] > > >> We need to try the polite m

Re: kvm PCI assignment & VFIO ramblings

2011-08-25 Thread David Gibson
On Wed, Aug 24, 2011 at 01:03:32PM +0200, Roedel, Joerg wrote: > On Wed, Aug 24, 2011 at 05:33:00AM -0400, David Gibson wrote: > > On Wed, Aug 24, 2011 at 11:14:26AM +0200, Roedel, Joerg wrote: > > > > I don't see a reason to make this meta-grouping static. It would harm > > > flexibility on x86.

Re: kvm PCI assignment & VFIO ramblings

2011-08-25 Thread Joerg Roedel
On Thu, Aug 25, 2011 at 11:20:30AM -0600, Alex Williamson wrote: > On Thu, 2011-08-25 at 12:54 +0200, Roedel, Joerg wrote: > > We need to solve this differently. ARM is starting to use the iommu-api > > too and this definitly does not work there. One possible solution might > > be to make the iomm

Re: kvm PCI assignment & VFIO ramblings

2011-08-25 Thread Alex Williamson
On Thu, 2011-08-25 at 12:54 +0200, Roedel, Joerg wrote: > Hi Alex, > > On Wed, Aug 24, 2011 at 05:13:49PM -0400, Alex Williamson wrote: > > Is this roughly what you're thinking of for the iommu_group component? > > Adding a dev_to_group iommu ops callback let's us consolidate the sysfs > > support

Re: kvm PCI assignment & VFIO ramblings

2011-08-25 Thread Roedel, Joerg
On Thu, Aug 25, 2011 at 11:38:09AM -0400, Don Dutile wrote: > On 08/25/2011 06:54 AM, Roedel, Joerg wrote: > > We need to solve this differently. ARM is starting to use the iommu-api > > too and this definitly does not work there. One possible solution might > > be to make the iommu-ops per-bus. >

Re: kvm PCI assignment & VFIO ramblings

2011-08-25 Thread Don Dutile
On 08/25/2011 06:54 AM, Roedel, Joerg wrote: Hi Alex, On Wed, Aug 24, 2011 at 05:13:49PM -0400, Alex Williamson wrote: Is this roughly what you're thinking of for the iommu_group component? Adding a dev_to_group iommu ops callback let's us consolidate the sysfs support in the iommu base. Would

Re: kvm PCI assignment & VFIO ramblings

2011-08-25 Thread Roedel, Joerg
On Wed, Aug 24, 2011 at 11:07:46AM -0400, Alex Williamson wrote: > On Wed, 2011-08-24 at 10:52 +0200, Roedel, Joerg wrote: > > On Tue, Aug 23, 2011 at 01:08:29PM -0400, Alex Williamson wrote: > > > On Tue, 2011-08-23 at 15:14 +0200, Roedel, Joerg wrote: > > > > > > Handling it through fds is a goo

Re: kvm PCI assignment & VFIO ramblings

2011-08-25 Thread Roedel, Joerg
On Wed, Aug 24, 2011 at 10:56:13AM -0400, Alex Williamson wrote: > On Wed, 2011-08-24 at 10:43 +0200, Joerg Roedel wrote: > > A side-note: Might it be better to expose assigned devices in a guest on > > a seperate bus? This will make it easier to emulate an IOMMU for the > > guest inside qemu. > >

Re: kvm PCI assignment & VFIO ramblings

2011-08-25 Thread Roedel, Joerg
Hi Alex, On Wed, Aug 24, 2011 at 05:13:49PM -0400, Alex Williamson wrote: > Is this roughly what you're thinking of for the iommu_group component? > Adding a dev_to_group iommu ops callback let's us consolidate the sysfs > support in the iommu base. Would AMD-Vi do something similar (or > exactly

Re: kvm PCI assignment & VFIO ramblings

2011-08-24 Thread Alex Williamson
Joerg, Is this roughly what you're thinking of for the iommu_group component? Adding a dev_to_group iommu ops callback let's us consolidate the sysfs support in the iommu base. Would AMD-Vi do something similar (or exactly the same) for group #s? Thanks, Alex Signed-off-by: Alex Williamson d

Re: kvm PCI assignment & VFIO ramblings

2011-08-24 Thread Alex Williamson
On Wed, 2011-08-24 at 10:52 +0200, Roedel, Joerg wrote: > On Tue, Aug 23, 2011 at 01:08:29PM -0400, Alex Williamson wrote: > > On Tue, 2011-08-23 at 15:14 +0200, Roedel, Joerg wrote: > > > > Handling it through fds is a good idea. This makes sure that everything > > > belongs to one process. I am

Re: kvm PCI assignment & VFIO ramblings

2011-08-24 Thread Alex Williamson
On Wed, 2011-08-24 at 10:43 +0200, Joerg Roedel wrote: > On Tue, Aug 23, 2011 at 03:30:06PM -0400, Alex Williamson wrote: > > On Tue, 2011-08-23 at 07:01 +1000, Benjamin Herrenschmidt wrote: > > > > Could be tho in what form ? returning sysfs pathes ? > > > > I'm at a loss there, please suggest.

Re: kvm PCI assignment & VFIO ramblings

2011-08-24 Thread Alex Williamson
On Wed, 2011-08-24 at 09:51 +1000, Benjamin Herrenschmidt wrote: > > > For us the most simple and logical approach (which is also what pHyp > > > uses and what Linux handles well) is really to expose a given PCI host > > > bridge per group to the guest. Believe it or not, it makes things > > > easi

Re: kvm PCI assignment & VFIO ramblings

2011-08-24 Thread Roedel, Joerg
On Wed, Aug 24, 2011 at 05:33:00AM -0400, David Gibson wrote: > On Wed, Aug 24, 2011 at 11:14:26AM +0200, Roedel, Joerg wrote: > > I don't see a reason to make this meta-grouping static. It would harm > > flexibility on x86. I think it makes things easier on power but there > > are options on that

Re: kvm PCI assignment & VFIO ramblings

2011-08-24 Thread David Gibson
On Wed, Aug 24, 2011 at 11:14:26AM +0200, Roedel, Joerg wrote: > On Tue, Aug 23, 2011 at 12:54:27PM -0400, aafabbri wrote: > > On 8/23/11 4:04 AM, "Joerg Roedel" wrote: > > > That is makes uiommu basically the same as the meta-groups, right? > > > > Yes, functionality seems the same, thus my sugg

Re: kvm PCI assignment & VFIO ramblings

2011-08-24 Thread Roedel, Joerg
On Tue, Aug 23, 2011 at 12:54:27PM -0400, aafabbri wrote: > On 8/23/11 4:04 AM, "Joerg Roedel" wrote: > > That is makes uiommu basically the same as the meta-groups, right? > > Yes, functionality seems the same, thus my suggestion to keep uiommu > explicit. Is there some need for group-groups be

Re: kvm PCI assignment & VFIO ramblings

2011-08-24 Thread Joerg Roedel
On Tue, Aug 23, 2011 at 01:33:14PM -0400, Aaron Fabbri wrote: > On 8/23/11 10:01 AM, "Alex Williamson" wrote: > > The iommu domain would probably be allocated when the first device is > > bound to vfio. As each device is bound, it gets attached to the group. > > DMAs are done via an ioctl on the

Re: kvm PCI assignment & VFIO ramblings

2011-08-24 Thread Roedel, Joerg
On Tue, Aug 23, 2011 at 07:35:37PM -0400, Benjamin Herrenschmidt wrote: > On Tue, 2011-08-23 at 15:18 +0200, Roedel, Joerg wrote: > > Hmm, good idea. But as far as I know the hotplug-event needs to be in > > the guest _before_ the device is actually unplugged (so that the guest > > can unbind its

Re: kvm PCI assignment & VFIO ramblings

2011-08-24 Thread Roedel, Joerg
On Tue, Aug 23, 2011 at 01:08:29PM -0400, Alex Williamson wrote: > On Tue, 2011-08-23 at 15:14 +0200, Roedel, Joerg wrote: > > Handling it through fds is a good idea. This makes sure that everything > > belongs to one process. I am not really sure yet if we go the way to > > just bind plain groups

Re: kvm PCI assignment & VFIO ramblings

2011-08-24 Thread Joerg Roedel
On Tue, Aug 23, 2011 at 03:30:06PM -0400, Alex Williamson wrote: > On Tue, 2011-08-23 at 07:01 +1000, Benjamin Herrenschmidt wrote: > > Could be tho in what form ? returning sysfs pathes ? > > I'm at a loss there, please suggest. I think we need an ioctl that > returns some kind of array of devi

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Alexander Graf
On 23.08.2011, at 18:51, Benjamin Herrenschmidt wrote: > >>> For us the most simple and logical approach (which is also what pHyp >>> uses and what Linux handles well) is really to expose a given PCI host >>> bridge per group to the guest. Believe it or not, it makes things >>> easier :-) >> >>

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Alexander Graf
On 23.08.2011, at 18:41, Benjamin Herrenschmidt wrote: > On Tue, 2011-08-23 at 10:23 -0600, Alex Williamson wrote: >> >> Yeah. Joerg's idea of binding groups internally (pass the fd of one >> group to another via ioctl) is one option. The tricky part will be >> implementing it to support hot u

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Benjamin Herrenschmidt
> > For us the most simple and logical approach (which is also what pHyp > > uses and what Linux handles well) is really to expose a given PCI host > > bridge per group to the guest. Believe it or not, it makes things > > easier :-) > > I'm all for easier. Why does exposing the bridge use less b

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Benjamin Herrenschmidt
On Tue, 2011-08-23 at 10:23 -0600, Alex Williamson wrote: > > Yeah. Joerg's idea of binding groups internally (pass the fd of one > group to another via ioctl) is one option. The tricky part will be > implementing it to support hot unplug of any group from the > supergroup. > I believe Ben had a

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Benjamin Herrenschmidt
On Tue, 2011-08-23 at 15:18 +0200, Roedel, Joerg wrote: > On Mon, Aug 22, 2011 at 05:03:53PM -0400, Benjamin Herrenschmidt wrote: > > > > > I am in favour of /dev/vfio/$GROUP. If multiple devices should be > > > assigned to a guest, there can also be an ioctl to bind a group to an > > > address-sp

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Alex Williamson
On Tue, 2011-08-23 at 07:01 +1000, Benjamin Herrenschmidt wrote: > On Mon, 2011-08-22 at 09:45 -0600, Alex Williamson wrote: > > > Yes, that's the idea. An open question I have towards the configuration > > side is whether we might add iommu driver specific options to the > > groups. For instanc

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Alex Williamson
On Tue, 2011-08-23 at 10:33 -0700, Aaron Fabbri wrote: > > > On 8/23/11 10:01 AM, "Alex Williamson" wrote: > > > On Tue, 2011-08-23 at 16:54 +1000, Benjamin Herrenschmidt wrote: > >> On Mon, 2011-08-22 at 17:52 -0700, aafabbri wrote: > >> > >>> I'm not following you. > >>> > >>> You have to e

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Aaron Fabbri
On 8/23/11 10:01 AM, "Alex Williamson" wrote: > On Tue, 2011-08-23 at 16:54 +1000, Benjamin Herrenschmidt wrote: >> On Mon, 2011-08-22 at 17:52 -0700, aafabbri wrote: >> >>> I'm not following you. >>> >>> You have to enforce group/iommu domain assignment whether you have the >>> existing uio

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Alex Williamson
On Tue, 2011-08-23 at 15:14 +0200, Roedel, Joerg wrote: > On Mon, Aug 22, 2011 at 03:17:00PM -0400, Alex Williamson wrote: > > On Mon, 2011-08-22 at 19:25 +0200, Joerg Roedel wrote: > > > > I am in favour of /dev/vfio/$GROUP. If multiple devices should be > > > assigned to a guest, there can also

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Alex Williamson
On Tue, 2011-08-23 at 16:54 +1000, Benjamin Herrenschmidt wrote: > On Mon, 2011-08-22 at 17:52 -0700, aafabbri wrote: > > > I'm not following you. > > > > You have to enforce group/iommu domain assignment whether you have the > > existing uiommu API, or if you change it to your proposed > > ioctl

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread aafabbri
On 8/23/11 4:04 AM, "Joerg Roedel" wrote: > On Mon, Aug 22, 2011 at 08:52:18PM -0400, aafabbri wrote: >> You have to enforce group/iommu domain assignment whether you have the >> existing uiommu API, or if you change it to your proposed >> ioctl(inherit_iommu) API. >> >> The only change neede

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Alex Williamson
On Tue, 2011-08-23 at 12:38 +1000, David Gibson wrote: > On Mon, Aug 22, 2011 at 09:45:48AM -0600, Alex Williamson wrote: > > On Mon, 2011-08-22 at 15:55 +1000, David Gibson wrote: > > > On Sat, Aug 20, 2011 at 09:51:39AM -0700, Alex Williamson wrote: > > > > We had an extremely productive VFIO BoF

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Roedel, Joerg
On Mon, Aug 22, 2011 at 03:17:00PM -0400, Alex Williamson wrote: > On Mon, 2011-08-22 at 19:25 +0200, Joerg Roedel wrote: > > I am in favour of /dev/vfio/$GROUP. If multiple devices should be > > assigned to a guest, there can also be an ioctl to bind a group to an > > address-space of another gro

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Roedel, Joerg
On Mon, Aug 22, 2011 at 05:03:53PM -0400, Benjamin Herrenschmidt wrote: > > > I am in favour of /dev/vfio/$GROUP. If multiple devices should be > > assigned to a guest, there can also be an ioctl to bind a group to an > > address-space of another group (certainly needs some care to not allow > > t

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Joerg Roedel
On Tue, Aug 23, 2011 at 02:54:43AM -0400, Benjamin Herrenschmidt wrote: > Possibly, the question that interest me the most is what interface will > KVM end up using. I'm also not terribly fan with the (perceived) > discrepancy between using uiommu to create groups but using the group fd > to actual

Re: kvm PCI assignment & VFIO ramblings

2011-08-23 Thread Joerg Roedel
On Mon, Aug 22, 2011 at 08:52:18PM -0400, aafabbri wrote: > You have to enforce group/iommu domain assignment whether you have the > existing uiommu API, or if you change it to your proposed > ioctl(inherit_iommu) API. > > The only change needed to VFIO here should be to make uiommu fd assignment

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Benjamin Herrenschmidt
On Mon, 2011-08-22 at 17:52 -0700, aafabbri wrote: > I'm not following you. > > You have to enforce group/iommu domain assignment whether you have the > existing uiommu API, or if you change it to your proposed > ioctl(inherit_iommu) API. > > The only change needed to VFIO here should be to make

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread David Gibson
On Mon, Aug 22, 2011 at 09:45:48AM -0600, Alex Williamson wrote: > On Mon, 2011-08-22 at 15:55 +1000, David Gibson wrote: > > On Sat, Aug 20, 2011 at 09:51:39AM -0700, Alex Williamson wrote: > > > We had an extremely productive VFIO BoF on Monday. Here's my attempt to > > > capture the plan that I

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread aafabbri
On 8/22/11 2:49 PM, "Benjamin Herrenschmidt" wrote: > >>> I wouldn't use uiommu for that. >> >> Any particular reason besides saving a file descriptor? >> >> We use it today, and it seems like a cleaner API than what you propose >> changing it to. > > Well for one, we are back to square on

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Benjamin Herrenschmidt
> > I wouldn't use uiommu for that. > > Any particular reason besides saving a file descriptor? > > We use it today, and it seems like a cleaner API than what you propose > changing it to. Well for one, we are back to square one vs. grouping constraints. .../... > If we in singleton-group la

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread aafabbri
On 8/22/11 1:49 PM, "Benjamin Herrenschmidt" wrote: > On Mon, 2011-08-22 at 13:29 -0700, aafabbri wrote: > >>> Each device fd would then support a >>> similar set of ioctls and mapping (mmio/pio/config) interface as current >>> vfio, except for the obvious domain and dma ioctls superseded by

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Benjamin Herrenschmidt
> I am in favour of /dev/vfio/$GROUP. If multiple devices should be > assigned to a guest, there can also be an ioctl to bind a group to an > address-space of another group (certainly needs some care to not allow > that both groups belong to different processes). > > Btw, a problem we havn't talk

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Benjamin Herrenschmidt
On Mon, 2011-08-22 at 09:45 -0600, Alex Williamson wrote: > Yes, that's the idea. An open question I have towards the configuration > side is whether we might add iommu driver specific options to the > groups. For instance on x86 where we typically have B:D.F granularity, > should we have an opt

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Benjamin Herrenschmidt
On Mon, 2011-08-22 at 09:30 +0300, Avi Kivity wrote: > On 08/20/2011 07:51 PM, Alex Williamson wrote: > > We need to address both the description and enforcement of device > > groups. Groups are formed any time the iommu does not have resolution > > between a set of devices. On x86, this typicall

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Benjamin Herrenschmidt
On Mon, 2011-08-22 at 13:29 -0700, aafabbri wrote: > > Each device fd would then support a > > similar set of ioctls and mapping (mmio/pio/config) interface as current > > vfio, except for the obvious domain and dma ioctls superseded by the > > group fd. > > > > Another valid model might be that

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread aafabbri
On 8/20/11 9:51 AM, "Alex Williamson" wrote: > We had an extremely productive VFIO BoF on Monday. Here's my attempt to > capture the plan that I think we agreed to: > > We need to address both the description and enforcement of device > groups. Groups are formed any time the iommu does not

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Alex Williamson
On Mon, 2011-08-22 at 19:25 +0200, Joerg Roedel wrote: > On Sat, Aug 20, 2011 at 12:51:39PM -0400, Alex Williamson wrote: > > We had an extremely productive VFIO BoF on Monday. Here's my attempt to > > capture the plan that I think we agreed to: > > > > We need to address both the description and

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Joerg Roedel
On Sat, Aug 20, 2011 at 12:51:39PM -0400, Alex Williamson wrote: > We had an extremely productive VFIO BoF on Monday. Here's my attempt to > capture the plan that I think we agreed to: > > We need to address both the description and enforcement of device > groups. Groups are formed any time the

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Alex Williamson
On Mon, 2011-08-22 at 15:55 +1000, David Gibson wrote: > On Sat, Aug 20, 2011 at 09:51:39AM -0700, Alex Williamson wrote: > > We had an extremely productive VFIO BoF on Monday. Here's my attempt to > > capture the plan that I think we agreed to: > > > > We need to address both the description and

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Roedel, Joerg
On Mon, Aug 22, 2011 at 09:17:41AM -0400, Avi Kivity wrote: > On 08/22/2011 04:15 PM, Roedel, Joerg wrote: > > On Mon, Aug 22, 2011 at 09:06:07AM -0400, Avi Kivity wrote: > > > On 08/22/2011 03:55 PM, Roedel, Joerg wrote: > > > > > > Well, I don't think its really meaningless, but we need some w

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Avi Kivity
On 08/22/2011 04:15 PM, Roedel, Joerg wrote: On Mon, Aug 22, 2011 at 09:06:07AM -0400, Avi Kivity wrote: > On 08/22/2011 03:55 PM, Roedel, Joerg wrote: > > Well, I don't think its really meaningless, but we need some way to > > communicate the information about device groups to userspace. >

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Roedel, Joerg
On Mon, Aug 22, 2011 at 09:06:07AM -0400, Avi Kivity wrote: > On 08/22/2011 03:55 PM, Roedel, Joerg wrote: > > Well, I don't think its really meaningless, but we need some way to > > communicate the information about device groups to userspace. > > I mean the contents of the group descriptor. Th

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Avi Kivity
On 08/22/2011 03:55 PM, Roedel, Joerg wrote: On Mon, Aug 22, 2011 at 08:42:35AM -0400, Avi Kivity wrote: > On 08/22/2011 03:36 PM, Roedel, Joerg wrote: > > On the AMD IOMMU side this information is stored in the IVRS ACPI table. > > Not sure about the VT-d side, though. > > I see. There is

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Roedel, Joerg
On Mon, Aug 22, 2011 at 08:42:35AM -0400, Avi Kivity wrote: > On 08/22/2011 03:36 PM, Roedel, Joerg wrote: > > On the AMD IOMMU side this information is stored in the IVRS ACPI table. > > Not sure about the VT-d side, though. > > I see. There is no sysfs node representing it? No. It also doesn't

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Avi Kivity
On 08/22/2011 03:36 PM, Roedel, Joerg wrote: On Mon, Aug 22, 2011 at 06:51:35AM -0400, Avi Kivity wrote: > On 08/22/2011 01:46 PM, Joerg Roedel wrote: > > That does not work. The bridge in question may not even be visible as a > > PCI device, so you can't link to it. This is the case on a fe

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Roedel, Joerg
On Mon, Aug 22, 2011 at 06:51:35AM -0400, Avi Kivity wrote: > On 08/22/2011 01:46 PM, Joerg Roedel wrote: > > That does not work. The bridge in question may not even be visible as a > > PCI device, so you can't link to it. This is the case on a few PCIe > > cards which only have a PCIx chip and a P

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Avi Kivity
On 08/22/2011 01:46 PM, Joerg Roedel wrote: > $ readlink /sys/devices/pci:00/:00:19.0/iommu_group > ../../../path/to/device/which/represents/the/resource/constraint > > (the pci-to-pci bridge on x86, or whatever node represents partitionable > endpoints on power) That does not work.

Re: kvm PCI assignment & VFIO ramblings

2011-08-22 Thread Joerg Roedel
On Mon, Aug 22, 2011 at 02:30:26AM -0400, Avi Kivity wrote: > On 08/20/2011 07:51 PM, Alex Williamson wrote: > > We need to address both the description and enforcement of device > > groups. Groups are formed any time the iommu does not have resolution > > between a set of devices. On x86, this t

Re: kvm PCI assignment & VFIO ramblings

2011-08-21 Thread Avi Kivity
On 08/20/2011 07:51 PM, Alex Williamson wrote: We need to address both the description and enforcement of device groups. Groups are formed any time the iommu does not have resolution between a set of devices. On x86, this typically happens when a PCI-to-PCI bridge exists between the set of devi

Re: kvm PCI assignment & VFIO ramblings

2011-08-21 Thread David Gibson
On Sat, Aug 20, 2011 at 09:51:39AM -0700, Alex Williamson wrote: > We had an extremely productive VFIO BoF on Monday. Here's my attempt to > capture the plan that I think we agreed to: > > We need to address both the description and enforcement of device > groups. Groups are formed any time the

Re: kvm PCI assignment & VFIO ramblings

2011-08-20 Thread Alex Williamson
We had an extremely productive VFIO BoF on Monday. Here's my attempt to capture the plan that I think we agreed to: We need to address both the description and enforcement of device groups. Groups are formed any time the iommu does not have resolution between a set of devices. On x86, this typi

Re: kvm PCI assignment & VFIO ramblings

2011-08-09 Thread Benjamin Herrenschmidt
> Mostly correct, yes. x86 isn't immune to the group problem, it shows up > for us any time there's a PCIe-to-PCI bridge in the device hierarchy. > We lose resolution of devices behind the bridge. As you state though, I > think of this as only a constraint on what we're able to do with those > d

Re: kvm PCI assignment & VFIO ramblings

2011-08-09 Thread Alex Williamson
On Mon, 2011-08-08 at 11:28 +0300, Avi Kivity wrote: > On 08/03/2011 05:04 AM, David Gibson wrote: > > I still don't understand the distinction you're making. We're saying > > the group is "owned" by a given user or guest in the sense that no-one > > else may use anything in the group (including h

Re: kvm PCI assignment & VFIO ramblings

2011-08-08 Thread Avi Kivity
On 08/03/2011 05:04 AM, David Gibson wrote: I still don't understand the distinction you're making. We're saying the group is "owned" by a given user or guest in the sense that no-one else may use anything in the group (including host drivers). At that point none, some or all of the devices in

Re: kvm PCI assignment & VFIO ramblings

2011-08-07 Thread David Gibson
On Fri, Aug 05, 2011 at 09:10:09AM -0600, Alex Williamson wrote: > On Fri, 2011-08-05 at 20:42 +1000, Benjamin Herrenschmidt wrote: > > Right. In fact to try to clarify the problem for everybody, I think we > > can distinguish two different classes of "constraints" that can > > influence the groupi

Re: kvm PCI assignment & VFIO ramblings

2011-08-05 Thread Benjamin Herrenschmidt
On Fri, 2011-08-05 at 15:44 +0200, Joerg Roedel wrote: > On Fri, Aug 05, 2011 at 08:42:38PM +1000, Benjamin Herrenschmidt wrote: > > > Right. In fact to try to clarify the problem for everybody, I think we > > can distinguish two different classes of "constraints" that can > > influence the groupi

Re: kvm PCI assignment & VFIO ramblings

2011-08-05 Thread Alex Williamson
On Fri, 2011-08-05 at 20:42 +1000, Benjamin Herrenschmidt wrote: > Right. In fact to try to clarify the problem for everybody, I think we > can distinguish two different classes of "constraints" that can > influence the grouping of devices: > > 1- Hard constraints. These are typically devices usi

Re: kvm PCI assignment & VFIO ramblings

2011-08-05 Thread Joerg Roedel
On Fri, Aug 05, 2011 at 08:42:38PM +1000, Benjamin Herrenschmidt wrote: > Right. In fact to try to clarify the problem for everybody, I think we > can distinguish two different classes of "constraints" that can > influence the grouping of devices: > > 1- Hard constraints. These are typically dev

Re: kvm PCI assignment & VFIO ramblings

2011-08-05 Thread Joerg Roedel
On Fri, Aug 05, 2011 at 08:26:11PM +1000, Benjamin Herrenschmidt wrote: > On Thu, 2011-08-04 at 12:41 +0200, Joerg Roedel wrote: > > On Mon, Aug 01, 2011 at 02:27:36PM -0600, Alex Williamson wrote: > > > It's not clear to me how we could skip it. With VT-d, we'd have to > > > implement an emulated

Re: kvm PCI assignment & VFIO ramblings

2011-08-05 Thread Benjamin Herrenschmidt
On Thu, 2011-08-04 at 12:27 +0200, Joerg Roedel wrote: > Hi Ben, > > thanks for your detailed introduction to the requirements for POWER. Its > good to know that the granularity problem is not x86-only. I'm happy to see your reply :-) I had the feeling I was a bit alone here... > On Sat, Jul 30,

Re: kvm PCI assignment & VFIO ramblings

2011-08-05 Thread Benjamin Herrenschmidt
On Thu, 2011-08-04 at 12:41 +0200, Joerg Roedel wrote: > On Mon, Aug 01, 2011 at 02:27:36PM -0600, Alex Williamson wrote: > > It's not clear to me how we could skip it. With VT-d, we'd have to > > implement an emulated interrupt remapper and hope that the guest picks > > unused indexes in the host

Re: kvm PCI assignment & VFIO ramblings

2011-08-04 Thread Joerg Roedel
On Mon, Aug 01, 2011 at 02:27:36PM -0600, Alex Williamson wrote: > It's not clear to me how we could skip it. With VT-d, we'd have to > implement an emulated interrupt remapper and hope that the guest picks > unused indexes in the host interrupt remapping table before it could do > anything useful

Re: kvm PCI assignment & VFIO ramblings

2011-08-04 Thread Joerg Roedel
On Sat, Jul 30, 2011 at 12:20:08PM -0600, Alex Williamson wrote: > On Sat, 2011-07-30 at 09:58 +1000, Benjamin Herrenschmidt wrote: > > - The -minimum- granularity of pass-through is not always a single > > device and not always under SW control > > But IMHO, we need to preserve the granularity of

Re: kvm PCI assignment & VFIO ramblings

2011-08-04 Thread Joerg Roedel
Hi Ben, thanks for your detailed introduction to the requirements for POWER. It's good to know that the granularity problem is not x86-only. On Sat, Jul 30, 2011 at 09:58:53AM +1000, Benjamin Herrenschmidt wrote: > In IBM POWER land, we call this a "partitionable endpoint" (the term > "endpoint" h

Re: kvm PCI assignment & VFIO ramblings

2011-08-03 Thread David Gibson
On Tue, Aug 02, 2011 at 09:44:49PM -0600, Alex Williamson wrote: > On Wed, 2011-08-03 at 12:04 +1000, David Gibson wrote: > > On Tue, Aug 02, 2011 at 12:35:19PM -0600, Alex Williamson wrote: > > > On Tue, 2011-08-02 at 12:14 -0600, Alex Williamson wrote: > > > > On Tue, 2011-08-02 at 18:28 +1000, D

Re: kvm PCI assignment & VFIO ramblings

2011-08-02 Thread Alex Williamson
On Wed, 2011-08-03 at 12:04 +1000, David Gibson wrote: > On Tue, Aug 02, 2011 at 12:35:19PM -0600, Alex Williamson wrote: > > On Tue, 2011-08-02 at 12:14 -0600, Alex Williamson wrote: > > > On Tue, 2011-08-02 at 18:28 +1000, David Gibson wrote: > > > > On Sat, Jul 30, 2011 at 12:20:08PM -0600, Alex

Re: kvm PCI assignment & VFIO ramblings

2011-08-02 Thread David Gibson
On Tue, Aug 02, 2011 at 12:35:19PM -0600, Alex Williamson wrote: > On Tue, 2011-08-02 at 12:14 -0600, Alex Williamson wrote: > > On Tue, 2011-08-02 at 18:28 +1000, David Gibson wrote: > > > On Sat, Jul 30, 2011 at 12:20:08PM -0600, Alex Williamson wrote: > > > > On Sat, 2011-07-30 at 09:58 +1000, B

Re: kvm PCI assignment & VFIO ramblings

2011-08-02 Thread Alex Williamson
On Tue, 2011-08-02 at 17:29 -0400, Konrad Rzeszutek Wilk wrote: > On Tue, Aug 02, 2011 at 09:34:58AM -0600, Alex Williamson wrote: > > On Tue, 2011-08-02 at 22:58 +1000, Benjamin Herrenschmidt wrote: > > > > > > Don't worry, it took me a while to get my head around the HW :-) SR-IOV > > > VFs will

Re: kvm PCI assignment & VFIO ramblings

2011-08-02 Thread Konrad Rzeszutek Wilk
On Tue, Aug 02, 2011 at 09:34:58AM -0600, Alex Williamson wrote: > On Tue, 2011-08-02 at 22:58 +1000, Benjamin Herrenschmidt wrote: > > > > Don't worry, it took me a while to get my head around the HW :-) SR-IOV > > VFs will generally not have limitations like that no, but on the other > > hand, t

Re: kvm PCI assignment & VFIO ramblings

2011-08-02 Thread Alex Williamson
On Tue, 2011-08-02 at 12:14 -0600, Alex Williamson wrote: > On Tue, 2011-08-02 at 18:28 +1000, David Gibson wrote: > > On Sat, Jul 30, 2011 at 12:20:08PM -0600, Alex Williamson wrote: > > > On Sat, 2011-07-30 at 09:58 +1000, Benjamin Herrenschmidt wrote: > > [snip] > > > On x86, the USB controllers

Re: kvm PCI assignment & VFIO ramblings

2011-08-02 Thread Alex Williamson
On Tue, 2011-08-02 at 18:28 +1000, David Gibson wrote: > On Sat, Jul 30, 2011 at 12:20:08PM -0600, Alex Williamson wrote: > > On Sat, 2011-07-30 at 09:58 +1000, Benjamin Herrenschmidt wrote: > [snip] > > On x86, the USB controllers don't typically live behind a PCIe-to-PCI > > bridge, so don't suff

Re: kvm PCI assignment & VFIO ramblings

2011-08-02 Thread Alex Williamson
On Tue, 2011-08-02 at 22:58 +1000, Benjamin Herrenschmidt wrote: > > Don't worry, it took me a while to get my head around the HW :-) SR-IOV > VFs will generally not have limitations like that no, but on the other > hand, they -will- still require 1 VF = 1 group, ie, you won't be able to > take a

Re: kvm PCI assignment & VFIO ramblings

2011-08-02 Thread Alex Williamson
On Tue, 2011-08-02 at 11:27 +1000, Benjamin Herrenschmidt wrote: > It's a shared address space. With a basic configuration on p7ioc for > example we have MMIO going from 3G to 4G (PCI side addresses). BARs > contain the normal PCI address there. But that 1G is divided in 128 > segments of equal siz

Re: kvm PCI assignment & VFIO ramblings

2011-08-02 Thread Avi Kivity
On 08/02/2011 03:58 PM, Benjamin Herrenschmidt wrote: > > > > What you mean 2-level is two passes through two trees (ie 6 or 8 levels > > right ?). > > (16 or 25) 25 levels ? You mean 25 loads to get to a translation ? And you get any kind of performance out of that ? :-) Aggressive par
