Re: [libvirt] Fwd: In Use tracker for network and pci-passthrough devices: Laine response

2012-06-28 Thread Eric Blake
On 06/28/2012 05:21 AM, Shradha Shah wrote:

>> In the meantime, I think the right way to do this is by integrating with
>> the code in the qemu driver that keeps track of which PCI devices are in
>> use. This already happens at the very basic level of "if the device
>> allocated by the network driver is in use, the attempt to assign the
>> device will fail"; instead, the network driver should be able to ask
>> qemu if the device it wants to allocate to the guest is already in use
>> (and reserve it, in one atomic operation).
>>
>> Of course, once the network driver has reserved the device from qemu's
>> PCI passthrough code, it would return that device to the qemu driver
>> code that wants to attach the interface, and it would fail because it
>> would be told the device is already in use (well, yeah! *We* just marked
>> it as in-use!). To make that work, I guess some sort of
>> cookie/handle/pointer would need to be passed from qemu's pci
>> passthrough code back to the network driver, and the network driver
>> would return it back to qemu's network interface attach code, which
>> would then use that special cookie/handle/pointer to attach the device
>> (saying "yeah, I know it's already in use, and here's my pass-card").
> 
> Wouldn't this approach require network driver to call functions from the
> qemu driver?
> I think this is not good for the hierarchical structure we are trying to 
> maintain. 

Agreed, we need to move device tracking out of qemu and into common
reusable code.

> 
>>
>> (Talking about this makes me think that the code that keeps track of PCI
>> device allocation shouldn't really be a part of qemu, but should be a
>> separate module, so that the network driver can still function properly
>> even if the qemu driver isn't loaded.)
> 
> Would this mean moving code to a new driver called device_driver.c or
> devicetracker_driver.c (which consumes device_conf.ch) and is called by 
> network, domain and qemu drivers?

Maybe even name it src/conf/nodedev_conf.[ch], since it deals with
handling of node devices.  But yes, the idea of a common file in
src/conf that can then be shared between network and qemu drivers makes
sense.
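
To make the idea of a shared tracker more concrete, here is a minimal sketch of what such common code could look like. Everything in it is an assumption for illustration only: the DemoPCIAddress type (libvirt's real address type is virDevicePCIAddress), the fixed-size table, and all of the demo* function names are hypothetical and are not existing libvirt API. It only shows the "check and reserve in one atomic operation" step that both the network and qemu drivers could call; it assumes devices are registered once at startup, before any concurrent reservation happens.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_DEVICES 8

/* Hypothetical PCI address key (stand-in for virDevicePCIAddress). */
typedef struct {
    unsigned int domain, bus, slot, function;
} DemoPCIAddress;

typedef struct {
    DemoPCIAddress addr;
    atomic_bool reserved;   /* flipped with compare-and-swap */
} DemoTrackedDevice;

typedef struct {
    DemoTrackedDevice devs[DEMO_MAX_DEVICES];
    size_t ndevs;
} DemoTracker;

void demoTrackerInit(DemoTracker *t)
{
    assert(t != NULL);
    t->ndevs = 0;
}

/* Register a device with the tracker; returns 0, or -1 if the table is full. */
int demoTrackerAdd(DemoTracker *t, DemoPCIAddress addr)
{
    assert(t != NULL);
    if (t->ndevs == DEMO_MAX_DEVICES)
        return -1;
    t->devs[t->ndevs].addr = addr;
    atomic_init(&t->devs[t->ndevs].reserved, false);
    t->ndevs++;
    return 0;
}

DemoTrackedDevice *demoTrackerFind(DemoTracker *t, const DemoPCIAddress *a)
{
    for (size_t i = 0; i < t->ndevs; i++) {
        const DemoPCIAddress *b = &t->devs[i].addr;
        if (a->domain == b->domain && a->bus == b->bus &&
            a->slot == b->slot && a->function == b->function)
            return &t->devs[i];
    }
    return NULL;
}

/* Check and reserve in one step: returns 0 if the caller now owns the
 * device, -1 if it is unknown or already reserved by someone else. */
int demoTrackerReserve(DemoTracker *t, const DemoPCIAddress *addr)
{
    DemoTrackedDevice *dev = demoTrackerFind(t, addr);
    bool expected = false;

    if (dev == NULL)
        return -1;
    return atomic_compare_exchange_strong(&dev->reserved, &expected, true)
        ? 0 : -1;
}

void demoTrackerRelease(DemoTracker *t, const DemoPCIAddress *addr)
{
    DemoTrackedDevice *dev = demoTrackerFind(t, addr);

    if (dev != NULL)
        atomic_store(&dev->reserved, false);
}
```

With something along these lines in common code, neither driver would need to call into the other; both would just ask the tracker.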

-- 
Eric Blake   ebl...@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org





--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] Fwd: In Use tracker for network and pci-passthrough devices: Laine response

2012-06-28 Thread Shradha Shah
Osier, Many thanks for your input.
Comments inline.

On 06/28/2012 11:48 AM, Shradha Shah wrote:
> This is a reply from Osier Yang
> 
> On 2012年06月27日 04:02, Laine Stump wrote:
>> (NB: I'm Cc'ing Osier on this email, as he's quite knowledgeable about
>> the PCI passthrough device allocation tracking code. You should probably
>> move this discussion to the mailing list sooner rather than later
>> though, as a public discussion of the design will give you a better
>> chance of your first revision getting successfully past review :-))
>>
>> On 06/26/2012 07:23 AM, Shradha Shah wrote:
>>> Laine,
>>>
>>> I have submitted my v2 patches for forward mode='hostdev' and am planning 
>>> to work on the in-use tracker for network
>>> and pci-passthrough devices.
>>>
>>> I am unable to wrap my head around how I should be implementing this 
>>> functionality. I am unable to decide at what
>>> level I should be implementing this (network, domain or qemu).
>>>
>>> May I ask for your guidance in order to implement this functionality?
>>>
>>
>> Yes, but I'm currently on vacation (in Turkey) so I won't have much time
>> to respond until July 9 when I return.
>>
>> In the meantime, I think the right way to do this is by integrating with
>> the code in the qemu driver that keeps track of which PCI devices are in
>> use. This already happens at the very basic level of "if the device
>> allocated by the network driver is in use, the attempt to assign the
>> device will fail"; instead, the network driver should be able to ask
>> qemu if the device it wants to allocate to the guest is already in use
>> (and reserve it, in one atomic operation).
> 
> Hi, Shradha, Laine,
> 
> I have not read your patches for "forward=hostdev" carefully, so I'm
> not sure I can point you in the right direction, but let me try:
> 
> It looks like what you want to do is just reserve the VF or PF from the
> host, and when the VF/PF is attached to a domain or used in other ways,
> you want it to be marked as in-use, am I correct?
> 
> If so, it should not be hard to do. For each PCI device we have a
> field named "used_by", which stores the name of the domain that uses
> it, and in the qemu driver we have two lists, "activePciHostdevs" and
> "inactivePciHostdevs", both of type pciDeviceList.
> 
> "activePciHostdevs" holds the PCI devices that are in use by any of
> the qemu domains, and "inactivePciHostdevs" holds the PCI devices that
> are detached from the host but not used by any domain. Basically, the
> purpose of "inactivePciHostdevs" is to solve the problem of resetting
> a PCI device when two PCI devices share the same bus. See commit
> 6be610bf for more details.
> 
> So that means the "used_by" field of the PCI device,
> "activePciHostdevs", and "inactivePciHostdevs" are all updated when
> attaching the interface to a domain, when detaching it from the
> domain, when the domain starts, and when the domain is shut down.
> 
> E.g., when attaching the interface to a domain (assuming the
> attachment succeeded), it needs to:
> 
> 1) Set "used_by" to the domain name.
> 2) Insert the device into the "activePciHostdevs" list.
> 3) Remove the device from the "inactivePciHostdevs" list if it was
>    there.
> 
> The process of detaching is just the opposite of the above. However,
> the whole process is much more complicated than the 3 listed steps.

This approach is easier to implement, but it would mean that we
have to access the qemu driver from the network driver, since we need
to make the decision about device usage in networkAllocateActualDevice.

I think this breaks the driver hierarchy.

> 
> I found that you introduce new members for virNetworkForwardIfDef:
> 
>  struct _virNetworkForwardIfDef {
> -    char *dev;      /* name of device */
> +    int type;
> +    union {
> +        virDevicePCIAddress pci; /* PCI Address of device */
> +        /* when USB devices are supported a new variable to be added here */
> +        char *dev;      /* name of device */
> +    } device;
> +    int usageCount; /* how many guest interfaces are bound to this device? */
> +};
> 
> So why not use pciDevice? E.g.:
> 
>  struct _virNetworkForwardIfDef {
>      char *dev;      /* name of device */
>      int type;
>      union {
>          pciDevice pci; /* PCI Address of device */
>          /* when USB devices are supported a new variable to be added here */
>          char *dev;      /* name of device */
>      } device;
>      int usageCount; /* how many guest interfaces are bound to this device? */
>  };
> 
> You can add usbDevice there once it's supported. That means
> you can reuse the existing code for PCI device management
> in the qemu driver.
> 

I was thinking of having a new driver called devicetracker_driver.c that
consumes device_conf.ch and is used in domain, network and qemu drivers.

>>
>> Of course, once the network driver has reserved the device from qemu's
>> PCI passthrough code, it would return that device to the qemu driver
>> cod

Re: [libvirt] Fwd: In Use tracker for network and pci-passthrough devices: Laine response

2012-06-28 Thread Shradha Shah
On 06/28/2012 11:33 AM, Shradha Shah wrote:
> This is a reply I got from Laine Stump
> =
> 
> (NB: I'm Cc'ing Osier on this email, as he's quite knowledgeable about
> the PCI passthrough device allocation tracking code. You should probably
> move this discussion to the mailing list sooner rather than later
> though, as a public discussion of the design will give you a better
> chance of your first revision getting successfully past review :-))
> 
> On 06/26/2012 07:23 AM, Shradha Shah wrote:
>>> Laine,
>>>
>>> I have submitted my v2 patches for forward mode='hostdev' and am planning 
>>> to work on the in-use tracker for network
>>> and pci-passthrough devices.
>>>
>>> I am unable to wrap my head around how I should be implementing this 
>>> functionality. I am unable to decide at what 
>>> level I should be implementing this (network, domain or qemu).
>>>
>>> May I ask for your guidance in order to implement this functionality? 
>>>
> Yes, but I'm currently on vacation (in Turkey) so I won't have much time
> to respond until July 9 when I return.
> 
> In the meantime, I think the right way to do this is by integrating with
> the code in the qemu driver that keeps track of which PCI devices are in
> use. This already happens at the very basic level of "if the device
> allocated by the network driver is in use, the attempt to assign the
> device will fail"; instead, the network driver should be able to ask
> qemu if the device it wants to allocate to the guest is already in use
> (and reserve it, in one atomic operation).
> 
> Of course, once the network driver has reserved the device from qemu's
> PCI passthrough code, it would return that device to the qemu driver
> code that wants to attach the interface, and it would fail because it
> would be told the device is already in use (well, yeah! *We* just marked
> it as in-use!). To make that work, I guess some sort of
> cookie/handle/pointer would need to be passed from qemu's pci
> passthrough code back to the network driver, and the network driver
> would return it back to qemu's network interface attach code, which
> would then use that special cookie/handle/pointer to attach the device
> (saying "yeah, I know it's already in use, and here's my pass-card").

Wouldn't this approach require the network driver to call functions
from the qemu driver?
I don't think that is good for the hierarchical structure we are trying
to maintain.

> 
> (Talking about this makes me think that the code that keeps track of PCI
> device allocation shouldn't really be a part of qemu, but should be a
> separate module, so that the network driver can still function properly
> even if the qemu driver isn't loaded.)

Would this mean moving code to a new driver called device_driver.c or
devicetracker_driver.c (which consumes device_conf.ch) and is called by 
network, domain and qemu drivers?

If so, I like this approach.

> 
> Another twist to this that should be considered - if any particular
> device is in use by at least one guest for one of the macvtap modes,
> that device also needs to be marked as in-use in libvirt's pci device
> table - it would be disastrous if another guest decided to use that
> device for standard PCI Passthrough.

Agreed.

> 
> (Keep in mind that I wrote everything above without even once looking at
> the code or any other reference, so you should take it with a grain of
> salt!)
> 
> 
> 
> Many Thanks,
> Regards,
> Shradha Shah
> 
> On 06/28/2012 11:19 AM, Shradha Shah wrote:
>> This is a conversation that I started with Laine Stump for the 
>> implementation of the in-use tracker for network and pci devices.
>>
>> I want to make this conversation more public in order to receive everyone's 
>> view on the topic.
>>
>> I will also post the responses I got from Laine and Osier Yang.
>>
>> Many Thanks,
>> Regards,
>> Shradha Shah
>>
>>
>> -------- Original Message --------
>> Subject: In Use tracker for network and pci-passthrough devices
>> Date: Tue, 26 Jun 2012 12:23:52 +0100
>> From: Shradha Shah 
>> To: Laine Stump 
>>
>> Laine,
>>
>> I have submitted my v2 patches for forward mode='hostdev' and am planning to 
>> work on the in-use tracker for network
>> and pci-passthrough devices.
>>
>> I am unable to wrap my head around how I should be implementing this 
>> functionality. I am unable to decide at what 
>> level I should be implementing this (network, domain or qemu).
>>
>> May I ask for your guidance in order to implement this functionality? 
>>
> 



Re: [libvirt] Fwd: In Use tracker for network and pci-passthrough devices: Laine response

2012-06-28 Thread Shradha Shah
On 06/27/2012 09:03 AM, Osier Yang wrote:
> > On 2012年06月27日 04:02, Laine Stump wrote:
>> >> (NB: I'm Cc'ing Osier on this email, as he's quite knowledgeable about
>> >> the PCI passthrough device allocation tracking code. You should probably
>> >> move this discussion to the mailing list sooner rather than later
>> >> though, as a public discussion of the design will give you a better
>> >> chance of your first revision getting successfully past review :-))
>> >>
>> >> On 06/26/2012 07:23 AM, Shradha Shah wrote:
>>> >>> Laine,
>>> >>>
>>> >>> I have submitted my v2 patches for forward mode='hostdev' and am
>>> >>> planning to work on the in-use tracker for network
>>> >>> and pci-passthrough devices.
>>> >>>
>>> >>> I am unable to wrap my head around how I should be implementing this
>>> >>> functionality. I am unable to decide at what
>>> >>> level I should be implementing this (network, domain or qemu).
>>> >>>
>>> >>> May I ask for your guidance in order to implement this functionality?
>>> >>>
>> >>
>> >> Yes, but I'm currently on vacation (in Turkey) so I won't have much time
>> >> to respond until July 9 when I return.
>> >>
>> >> In the meantime, I think the right way to do this is by integrating with
>> >> the code in the qemu driver that keeps track of which PCI devices are in
>> >> use. This already happens at the very basic level of "if the device
>> >> allocated by the network driver is in use, the attempt to assign the
>> >> device will fail"; instead, the network driver should be able to ask
>> >> qemu if the device it wants to allocate to the guest is already in use
>> >> (and reserve it, in one atomic operation).
> >
> > Hi, Shradha, Laine,
> >
> > I have not read your patches for "forward=hostdev" carefully, so I'm
> > not sure I can point you in the right direction, but let me try:
> >
> > It looks like what you want to do is just reserve the VF or PF from the
> > host, and when the VF/PF is attached to a domain or used in other ways,
> > you want it to be marked as in-use, am I correct?
Correct. Currently the network driver picks a device from its pool and
returns it to qemu having no idea if maybe that device is already used
in some other way. By the time we get back to qemu and learn that the
device is already used, the best we can do is fail, which is "less than
ideal" :-)

> >
> > If so, it should not be hard to do. For each PCI device we have a
> > field named "used_by", which stores the name of the domain that uses
> > it, and in the qemu driver we have two lists, "activePciHostdevs" and
> > "inactivePciHostdevs", both of type pciDeviceList.
> >
> > "activePciHostdevs" holds the PCI devices that are in use by any of
> > the qemu domains, and "inactivePciHostdevs" holds the PCI devices that
> > are detached from the host but not used by any domain. Basically, the
> > purpose of "inactivePciHostdevs" is to solve the problem of resetting
> > a PCI device when two PCI devices share the same bus. See commit
> > 6be610bf for more details.
> >
> > So that means the "used_by" field of the PCI device,
> > "activePciHostdevs", and "inactivePciHostdevs" are all updated when
> > attaching the interface to a domain, when detaching it from the
> > domain, when the domain starts, and when the domain is shut down.
> >
> > E.g., when attaching the interface to a domain (assuming the
> > attachment succeeded), it needs to:
> >
> > 1) Set "used_by" to the domain name.
> > 2) Insert the device into the "activePciHostdevs" list.
> > 3) Remove the device from the "inactivePciHostdevs" list if it was
> >    there.
The trick is to do enough of that in networkAllocateActualDevice to
assure that 1) the device won't be used by someone else, 2) the guest
that's grabbing the device *can* use it, and 3) "the right thing" will
happen if libvirtd is restarted sometime after the device is "reserved"
but before the guest is started.

> >
> >
> > The process of detaching is just the opposite of the above. However,
> > the whole process is much more complicated than the 3 listed steps.
> >
> > I found that you introduce new members for virNetworkForwardIfDef:
> >
> >  struct _virNetworkForwardIfDef {
> > -    char *dev;      /* name of device */
> > +    int type;
> > +    union {
> > +        virDevicePCIAddress pci; /* PCI Address of device */
> > +        /* when USB devices are supported a new variable to be added here */
> > +        char *dev;      /* name of device */
> > +    } device;
> > +    int usageCount; /* how many guest interfaces are bound to this device? */
> > +};
> >
> > So why not use pciDevice? E.g.:
In general I think it would be a good idea to unify pciDevice,
virDevicePCIAddress, and pci_config_address as much as possible, but
pciDevice itself has a lot of fields that don't make sense in a
configuration object, and anyway currently all the other conf code
(including hostdev definitions) uses virDevicePCIAddress, and there is
already code to parse/format to/from a virDevicePCIAddress. As a matter
of fact, pciDevice is defined in pci.c, so it can't be used anywhere
else,

Re: [libvirt] Fwd: In Use tracker for network and pci-passthrough devices: Laine response

2012-06-28 Thread Shradha Shah
This is a reply from Osier Yang

On 2012年06月27日 04:02, Laine Stump wrote:
> (NB: I'm Cc'ing Osier on this email, as he's quite knowledgeable about
> the PCI passthrough device allocation tracking code. You should probably
> move this discussion to the mailing list sooner rather than later
> though, as a public discussion of the design will give you a better
> chance of your first revision getting successfully past review :-))
>
> On 06/26/2012 07:23 AM, Shradha Shah wrote:
>> Laine,
>>
>> I have submitted my v2 patches for forward mode='hostdev' and am planning to 
>> work on the in-use tracker for network
>> and pci-passthrough devices.
>>
>> I am unable to wrap my head around how I should be implementing this 
>> functionality. I am unable to decide at what
>> level I should be implementing this (network, domain or qemu).
>>
>> May I ask for your guidance in order to implement this functionality?
>>
>
> Yes, but I'm currently on vacation (in Turkey) so I won't have much time
> to respond until July 9 when I return.
>
> In the meantime, I think the right way to do this is by integrating with
> the code in the qemu driver that keeps track of which PCI devices are in
> use. This already happens at the very basic level of "if the device
> allocated by the network driver is in use, the attempt to assign the
> device will fail"; instead, the network driver should be able to ask
> qemu if the device it wants to allocate to the guest is already in use
> (and reserve it, in one atomic operation).

Hi, Shradha, Laine,

I have not read your patches for "forward=hostdev" carefully, so I'm
not sure I can point you in the right direction, but let me try:

It looks like what you want to do is just reserve the VF or PF from the
host, and when the VF/PF is attached to a domain or used in other ways,
you want it to be marked as in-use, am I correct?

If so, it should not be hard to do. For each PCI device we have a
field named "used_by", which stores the name of the domain that uses
it, and in the qemu driver we have two lists, "activePciHostdevs" and
"inactivePciHostdevs", both of type pciDeviceList.

"activePciHostdevs" holds the PCI devices that are in use by any of
the qemu domains, and "inactivePciHostdevs" holds the PCI devices that
are detached from the host but not used by any domain. Basically, the
purpose of "inactivePciHostdevs" is to solve the problem of resetting
a PCI device when two PCI devices share the same bus. See commit
6be610bf for more details.

So that means the "used_by" field of the PCI device,
"activePciHostdevs", and "inactivePciHostdevs" are all updated when
attaching the interface to a domain, when detaching it from the
domain, when the domain starts, and when the domain is shut down.

E.g., when attaching the interface to a domain (assuming the
attachment succeeded), it needs to:

1) Set "used_by" to the domain name.
2) Insert the device into the "activePciHostdevs" list.
3) Remove the device from the "inactivePciHostdevs" list if it was
   there.

The process of detaching is just the opposite of the above. However,
the whole process is much more complicated than the 3 listed steps.
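
The three attach steps, and their reverse on detach, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the qemu driver's code: DemoPciDevice and DemoPciDeviceList are hypothetical stand-ins for pciDevice and pciDeviceList, the lists are fixed-size arrays of pointers, and detach is modelled as the literal reverse of attach (the device goes back onto the inactive list). Locking, device reset and error reporting are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_LIST_MAX 16

/* Hypothetical stand-in for pciDevice. */
typedef struct {
    const char *name;     /* e.g. "0000:03:10.0" */
    const char *used_by;  /* domain name, or NULL when unused */
} DemoPciDevice;

/* Hypothetical stand-in for pciDeviceList. */
typedef struct {
    DemoPciDevice *devs[DEMO_LIST_MAX];
    size_t count;
} DemoPciDeviceList;

/* Returns the index of dev in the list, or -1 if it is not there. */
int demoListFind(const DemoPciDeviceList *list, const DemoPciDevice *dev)
{
    for (size_t i = 0; i < list->count; i++) {
        if (list->devs[i] == dev)
            return (int)i;
    }
    return -1;
}

/* Returns 0 on success, -1 if the list is full or already holds dev. */
int demoListAdd(DemoPciDeviceList *list, DemoPciDevice *dev)
{
    if (list->count == DEMO_LIST_MAX || demoListFind(list, dev) >= 0)
        return -1;
    list->devs[list->count++] = dev;
    return 0;
}

/* Removes dev if present; does nothing otherwise. */
void demoListRemove(DemoPciDeviceList *list, const DemoPciDevice *dev)
{
    int idx = demoListFind(list, dev);

    if (idx < 0)
        return;
    list->devs[idx] = list->devs[list->count - 1];
    list->count--;
}

int demoMarkAttached(DemoPciDeviceList *active, DemoPciDeviceList *inactive,
                     DemoPciDevice *dev, const char *domain)
{
    assert(active != NULL && inactive != NULL && dev != NULL && domain != NULL);
    if (dev->used_by != NULL)
        return -1;                   /* already in use by some domain */
    if (demoListAdd(active, dev) < 0)
        return -1;                   /* step 2: insert into active list */
    dev->used_by = domain;           /* step 1: record the owning domain */
    demoListRemove(inactive, dev);   /* step 3: drop from inactive list */
    return 0;
}

int demoMarkDetached(DemoPciDeviceList *active, DemoPciDeviceList *inactive,
                     DemoPciDevice *dev)
{
    assert(active != NULL && inactive != NULL && dev != NULL);
    if (demoListFind(active, dev) < 0)
        return -1;                   /* was never attached */
    if (demoListAdd(inactive, dev) < 0)
        return -1;
    demoListRemove(active, dev);
    dev->used_by = NULL;
    return 0;
}
```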

I found that you introduce new members for virNetworkForwardIfDef:

 struct _virNetworkForwardIfDef {
-    char *dev;      /* name of device */
+    int type;
+    union {
+        virDevicePCIAddress pci; /* PCI Address of device */
+        /* when USB devices are supported a new variable to be added here */
+        char *dev;      /* name of device */
+    } device;
+    int usageCount; /* how many guest interfaces are bound to this device? */
+};

So why not use pciDevice? E.g.:

 struct _virNetworkForwardIfDef {
     char *dev;      /* name of device */
     int type;
     union {
         pciDevice pci; /* PCI Address of device */
         /* when USB devices are supported a new variable to be added here */
         char *dev;      /* name of device */
     } device;
     int usageCount; /* how many guest interfaces are bound to this device? */
 };

You can add usbDevice there once it's supported. That means
you can reuse the existing code for PCI device management
in the qemu driver.

>
> Of course, once the network driver has reserved the device from qemu's
> PCI passthrough code, it would return that device to the qemu driver
> code that wants to attach the interface, and it would fail because it
> would be told the device is already in use (well, yeah! *We* just marked
> it as in-use!). To make that work, I guess some sort of
> cookie/handle/pointer would need to be passed from qemu's pci
> passthrough code back to the network driver, and the network driver
> would return it back to qemu's network interface attach code, which
> would then use that special cookie/handle/pointer to attach the device
> (saying "yeah, I know it's already in use, and here's my pass-card").
>
> (Talking about this makes me think that the code that keeps track of PCI
> device allocation shouldn't really be a p

Re: [libvirt] Fwd: In Use tracker for network and pci-passthrough devices: Laine response

2012-06-28 Thread Shradha Shah
This is a reply I got from Laine Stump
=

(NB: I'm Cc'ing Osier on this email, as he's quite knowledgeable about
the PCI passthrough device allocation tracking code. You should probably
move this discussion to the mailing list sooner rather than later
though, as a public discussion of the design will give you a better
chance of your first revision getting successfully past review :-))

On 06/26/2012 07:23 AM, Shradha Shah wrote:
> > Laine,
> >
> > I have submitted my v2 patches for forward mode='hostdev' and am planning 
> > to work on the in-use tracker for network
> > and pci-passthrough devices.
> >
> > I am unable to wrap my head around how I should be implementing this 
> > functionality. I am unable to decide at what 
> > level I should be implementing this (network, domain or qemu).
> >
> > May I ask for your guidance in order to implement this functionality? 
> >
Yes, but I'm currently on vacation (in Turkey) so I won't have much time
to respond until July 9 when I return.

In the meantime, I think the right way to do this is by integrating with
the code in the qemu driver that keeps track of which PCI devices are in
use. This already happens at the very basic level of "if the device
allocated by the network driver is in use, the attempt to assign the
device will fail"; instead, the network driver should be able to ask
qemu if the device it wants to allocate to the guest is already in use
(and reserve it, in one atomic operation).

Of course, once the network driver has reserved the device from qemu's
PCI passthrough code, it would return that device to the qemu driver
code that wants to attach the interface, and it would fail because it
would be told the device is already in use (well, yeah! *We* just marked
it as in-use!). To make that work, I guess some sort of
cookie/handle/pointer would need to be passed from qemu's pci
passthrough code back to the network driver, and the network driver
would return it back to qemu's network interface attach code, which
would then use that special cookie/handle/pointer to attach the device
(saying "yeah, I know it's already in use, and here's my pass-card").
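
The cookie/handle idea could look roughly like the sketch below. It is only an illustration under stated assumptions: DemoDevice, DemoCookie, demoReserve and demoAttach are hypothetical names, the cookie is a plain per-device counter rather than anything hard to guess, and the code is not thread-safe. The point it shows is just the control flow: reserving hands back a cookie, and a later attach of an already reserved device succeeds only when the caller presents that same cookie.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cookie type; 0 is reserved to mean "no cookie". */
typedef uint64_t DemoCookie;

/* Hypothetical per-device tracking state. */
typedef struct {
    bool reserved;           /* someone has claimed the device */
    bool attached;           /* the device is attached to a guest */
    DemoCookie cookie;       /* cookie handed out by the current reservation */
    DemoCookie next_cookie;  /* counter used to mint cookies */
} DemoDevice;

/* Called on behalf of the network driver: reserve the device and return a
 * non-zero cookie, or return 0 if the device is already reserved. */
DemoCookie demoReserve(DemoDevice *dev)
{
    assert(dev != NULL);
    if (dev->reserved)
        return 0;
    dev->reserved = true;
    dev->cookie = ++dev->next_cookie;
    return dev->cookie;
}

/* Called by the attach code: returns 0 on success, -1 on failure.
 * An unreserved device can be attached directly (cookie is ignored);
 * a reserved one needs the matching cookie as its "pass-card". */
int demoAttach(DemoDevice *dev, DemoCookie cookie)
{
    assert(dev != NULL);
    if (dev->attached)
        return -1;
    if (dev->reserved) {
        if (cookie == 0 || cookie != dev->cookie)
            return -1;       /* in use, and the caller has no pass-card */
    } else {
        dev->reserved = true;
    }
    dev->attached = true;
    return 0;
}
```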

(Talking about this makes me think that the code that keeps track of PCI
device allocation shouldn't really be a part of qemu, but should be a
separate module, so that the network driver can still function properly
even if the qemu driver isn't loaded.)

Another twist to this that should be considered - if any particular
device is in use by at least one guest for one of the macvtap modes,
that device also needs to be marked as in-use in libvirt's pci device
table - it would be disastrous if another guest decided to use that
device for standard PCI Passthrough.
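
One way to picture this rule is a per-device usage record that distinguishes shared macvtap use from exclusive PCI passthrough. The sketch below is an assumption-laden illustration, not libvirt code: the DemoUseKind enum, DemoDeviceUsage struct and demo* functions are hypothetical, and it assumes that macvtap use may be shared by several guests (counted with a usage count) while passthrough is exclusive. Individual macvtap modes that are themselves exclusive are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of how a device is currently used. */
typedef enum {
    DEMO_USE_NONE,
    DEMO_USE_MACVTAP,      /* shared: several guests may use the device */
    DEMO_USE_PASSTHROUGH   /* exclusive: exactly one guest owns the device */
} DemoUseKind;

typedef struct {
    DemoUseKind kind;
    unsigned int usageCount;  /* how many guest interfaces use the device */
} DemoDeviceUsage;

/* Returns 0 if the device may be used in the requested way, -1 otherwise. */
int demoAcquire(DemoDeviceUsage *u, DemoUseKind want)
{
    assert(u != NULL && want != DEMO_USE_NONE);
    if (u->kind == DEMO_USE_NONE) {
        u->kind = want;
        u->usageCount = 1;
        return 0;
    }
    if (u->kind == DEMO_USE_MACVTAP && want == DEMO_USE_MACVTAP) {
        u->usageCount++;       /* another guest shares the device */
        return 0;
    }
    return -1;                 /* passthrough conflicts with any other use */
}

/* Drops one user; the device becomes free when the last user is gone. */
void demoRelease(DemoDeviceUsage *u)
{
    assert(u != NULL);
    if (u->usageCount == 0)
        return;
    if (--u->usageCount == 0)
        u->kind = DEMO_USE_NONE;
}
```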

(Keep in mind that I wrote everything above without even once looking at
the code or any other reference, so you should take it with a grain of
salt!)



Many Thanks,
Regards,
Shradha Shah

On 06/28/2012 11:19 AM, Shradha Shah wrote:
> This is a conversation that I started with Laine Stump for the implementation 
> of the in-use tracker for network and pci devices.
> 
> I want to make this conversation more public in order to receive everyone's 
> view on the topic.
> 
> I will also post the responses I got from Laine and Osier Yang.
> 
> Many Thanks,
> Regards,
> Shradha Shah
> 
> 
> -------- Original Message --------
> Subject: In Use tracker for network and pci-passthrough devices
> Date: Tue, 26 Jun 2012 12:23:52 +0100
> From: Shradha Shah 
> To: Laine Stump 
> 
> Laine,
> 
> I have submitted my v2 patches for forward mode='hostdev' and am planning to 
> work on the in-use tracker for network
> and pci-passthrough devices.
> 
> I am unable to wrap my head around how I should be implementing this 
> functionality. I am unable to decide at what 
> level I should be implementing this (network, domain or qemu).
> 
> May I ask for your guidance in order to implement this functionality? 
> 
