[PATCH] vfio: Whitelist PCI bridges

2015-09-11 Thread Alex Williamson
When determining whether a group is viable, we already allow devices
bound to pcieport.  Generalize this to include any PCI bridge device.

Signed-off-by: Alex Williamson 
---
 drivers/vfio/vfio.c |   31 +--
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 563c510..1c0f98c 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -438,16 +439,33 @@ static struct vfio_device *vfio_group_get_device(struct vfio_group *group,
 }
 
 /*
- * Whitelist some drivers that we know are safe (no dma) or just sit on
- * a device.  It's not always practical to leave a device within a group
- * driverless as it could get re-bound to something unsafe.
+ * Some drivers, like pci-stub, are only used to prevent other drivers from
+ * claiming a device and are therefore perfectly legitimate for a user owned
+ * group.  The pci-stub driver has no dependencies on DMA or the IOVA mapping
+ * of the device, but it does prevent the user from having direct access to
+ * the device, which is useful in some circumstances.
+ *
+ * We also assume that we can include PCI interconnect devices, ie. bridges.
+ * IOMMU grouping on PCI necessitates that if we lack isolation on a bridge
+ * then all of the downstream devices will be part of the same IOMMU group as
+ * the bridge.  Thus, if placing the bridge into the user owned IOVA space
+ * breaks anything, it only does so for user owned devices downstream.  Note
+ * that error notification via MSI can be affected for platforms that handle
+ * MSI within the same IOVA space as DMA.
  */
-static const char * const vfio_driver_whitelist[] = { "pci-stub", "pcieport" };
+static const char * const vfio_driver_whitelist[] = { "pci-stub" };
 
-static bool vfio_whitelisted_driver(struct device_driver *drv)
+static bool vfio_dev_whitelisted(struct device *dev, struct device_driver *drv)
 {
 	int i;
 
+	if (dev_is_pci(dev)) {
+		struct pci_dev *pdev = to_pci_dev(dev);
+
+		if (pdev->hdr_type != PCI_HEADER_TYPE_NORMAL)
+			return true;
+	}
+
 	for (i = 0; i < ARRAY_SIZE(vfio_driver_whitelist); i++) {
 		if (!strcmp(drv->name, vfio_driver_whitelist[i]))
 			return true;
@@ -462,6 +480,7 @@ static bool vfio_whitelisted_driver(struct device_driver *drv)
  *  - driver-less
  *  - bound to a vfio driver
  *  - bound to a whitelisted driver
+ *  - a PCI interconnect device
  *
  * We use two methods to determine whether a device is bound to a vfio
  * driver.  The first is to test whether the device exists in the vfio
@@ -486,7 +505,7 @@ static int vfio_dev_viable(struct device *dev, void *data)
}
mutex_unlock(&group->unbound_lock);
 
-   if (!ret || !drv || vfio_whitelisted_driver(drv))
+   if (!ret || !drv || vfio_dev_whitelisted(dev, drv))
return 0;
 
device = vfio_group_get_device(group, dev);
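
For readers who want to see the rule in isolation, the following is a
minimal userspace sketch of the check this patch introduces. It is not
kernel code: struct mock_dev, the HDR_TYPE_* constants and
mock_dev_whitelisted() are hypothetical stand-ins for struct pci_dev,
the PCI_HEADER_TYPE_* macros and vfio_dev_whitelisted().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* PCI header types (low 7 bits of the Header Type register). */
#define HDR_TYPE_NORMAL  0x00  /* ordinary endpoint */
#define HDR_TYPE_BRIDGE  0x01  /* PCI-to-PCI bridge */
#define HDR_TYPE_CARDBUS 0x02  /* CardBus bridge */

/* Hypothetical stand-in for struct device / struct pci_dev. */
struct mock_dev {
    bool is_pci;
    unsigned char hdr_type;
};

/* After the patch, only pci-stub remains whitelisted by driver name. */
static const char *const driver_whitelist[] = { "pci-stub" };

/* Mirrors the shape of vfio_dev_whitelisted() from the patch. */
static bool mock_dev_whitelisted(const struct mock_dev *dev,
                                 const char *drv_name)
{
    size_t i;

    assert(dev != NULL && drv_name != NULL);

    /* Any PCI device with a non-normal header is treated as a bridge,
     * regardless of which driver is bound to it. */
    if (dev->is_pci && dev->hdr_type != HDR_TYPE_NORMAL)
        return true;

    for (i = 0; i < sizeof(driver_whitelist) / sizeof(driver_whitelist[0]); i++) {
        if (!strcmp(drv_name, driver_whitelist[i]))
            return true;
    }
    return false;
}
```

In this model a bridge bound to pcieport (or to any other driver) is
accepted through the header-type test, while an endpoint is accepted
only when its driver name matches the whitelist.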

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] device_isolation: Support isolation on POWER p7ioc (IODA) bridges

2012-02-01 Thread David Gibson
On Wed, Feb 01, 2012 at 12:17:05PM -0700, Alex Williamson wrote:
> On Wed, 2012-02-01 at 15:46 +1100, David Gibson wrote:
> > This patch adds code to the code for the powernv platform to create
> > and populate isolation groups on hardware using the p7ioc (aka IODA) PCI host
> > bridge used on some IBM POWER systems.
> > 
> > Signed-off-by: Alexey Kardashevskiy 
> > Signed-off-by: David Gibson 
> > ---
> >  arch/powerpc/platforms/powernv/pci-ioda.c |   18 --
> >  arch/powerpc/platforms/powernv/pci.h  |6 ++
> >  2 files changed, 22 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
> > b/arch/powerpc/platforms/powernv/pci-ioda.c
> > index 5e155df..4648475 100644
> > --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> > @@ -20,6 +20,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include 
> >  #include 
> > @@ -877,6 +878,9 @@ static void __devinit pnv_ioda_setup_bus_dma(struct 
> > pnv_ioda_pe *pe,
> > set_iommu_table_base(&dev->dev, &pe->tce32_table);
> > if (dev->subordinate)
> > pnv_ioda_setup_bus_dma(pe, dev->subordinate);
> > +#ifdef CONFIG_DEVICE_ISOLATION
> > +   device_isolation_dev_add(&pe->di_group, &dev->dev);
> > +#endif
> > }
> >  }
> >  
> > @@ -957,11 +961,21 @@ static void __devinit 
> > pnv_pci_ioda_setup_dma_pe(struct pnv_phb *phb,
> > }
> > iommu_init_table(tbl, phb->hose->node);
> >  
> > -   if (pe->pdev)
> > +#ifdef CONFIG_DEVICE_ISOLATION
> > +   BUG_ON(device_isolation_group_init(&pe->di_group, "ioda:rid%x-pe%x",
> > +  pe->rid, pe->pe_number) < 0);
> > +#endif
> > +
> > +   if (pe->pdev) {
> > set_iommu_table_base(&pe->pdev->dev, tbl);
> > -   else
> > +#ifdef CONFIG_DEVICE_ISOLATION
> > +   device_isolation_dev_add(&pe->di_group, &pe->pdev->dev);
> > +#endif
> > +   } else
> > pnv_ioda_setup_bus_dma(pe, pe->pbus);
> 
> Blech, #ifdefs.

Hm, yeah.  The problem is the di_group member not even existing when
!DEVICE_ISOLATION.  Might be able to avoid that with an empty
structure in that case.

> > +
> > +
> > return;
> >   fail:
> > /* XXX Failure: Try to fallback to 64-bit only ? */
> > diff --git a/arch/powerpc/platforms/powernv/pci.h 
> > b/arch/powerpc/platforms/powernv/pci.h
> > index 64ede1e..3e282b7 100644
> > --- a/arch/powerpc/platforms/powernv/pci.h
> > +++ b/arch/powerpc/platforms/powernv/pci.h
> > @@ -1,6 +1,8 @@
> >  #ifndef __POWERNV_PCI_H
> >  #define __POWERNV_PCI_H
> >  
> > +#include 
> > +
> >  struct pci_dn;
> >  
> >  enum pnv_phb_type {
> > @@ -60,6 +62,10 @@ struct pnv_ioda_pe {
> >  
> > /* Link in list of PE#s */
> > struct list_head link;
> > +
> > +#ifdef CONFIG_DEVICE_ISOLATION
> > +   struct device_isolation_group di_group;
> > +#endif
> 
> Embedding the struct means we need to know the size, which means we
> can't get rid of the #ifdef.  Probably better to use a pointer if we
> don't mind adding a few bytes in the #ifndef case.  Thanks,

I've been back and forth a few times on this, and I've convinced
myself that allowing the group structure to be embedded is a better
idea.  It's a particular help when you need to construct one from
platform or bridge init code that runs before mem_init_done.
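
As a rough illustration of the empty-structure idea mentioned earlier in
this reply, here is a small userspace sketch. All demo_* names and the
DEMO_DEVICE_ISOLATION macro are hypothetical, and the empty struct
relies on a GNU C extension (accepted by gcc and clang); this is an
assumption about how it could look, not the proposed kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Define DEMO_DEVICE_ISOLATION to model CONFIG_DEVICE_ISOLATION=y. */

#ifdef DEMO_DEVICE_ISOLATION
struct demo_isolation_group {
    const char *name;
    int ndevices;
};

static inline int demo_group_init(struct demo_isolation_group *g,
                                  const char *name)
{
    g->name = name;
    g->ndevices = 0;
    return 0;
}

static inline void demo_group_add(struct demo_isolation_group *g)
{
    g->ndevices++;
}
#else
/* Empty struct (GNU C extension): embedding it costs no space. */
struct demo_isolation_group { };

static inline int demo_group_init(struct demo_isolation_group *g,
                                  const char *name)
{
    (void)g;
    (void)name;
    return 0;
}

static inline void demo_group_add(struct demo_isolation_group *g)
{
    (void)g;
}
#endif

/* The group is embedded unconditionally: no #ifdef in the container. */
struct demo_pe {
    int pe_number;
    struct demo_isolation_group di_group;
};

/* The call sites need no #ifdef either; the stubs compile away. */
static int demo_setup_pe(struct demo_pe *pe, int number)
{
    assert(pe != NULL);
    pe->pe_number = number;
    if (demo_group_init(&pe->di_group, "demo") < 0)
        return -1;
    demo_group_add(&pe->di_group);
    return 0;
}
```

The point of the sketch is only that the member always exists, so both
the structure definition and its users can drop their #ifdef blocks.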

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


Re: [PATCH 3/3] device_isolation: Support isolation on POWER p7ioc (IODA) bridges

2012-02-01 Thread Alex Williamson
On Wed, 2012-02-01 at 15:46 +1100, David Gibson wrote:
> This patch adds code to the code for the powernv platform to create
> and populate isolation groups on hardware using the p7ioc (aka IODA) PCI host
> bridge used on some IBM POWER systems.
> 
> Signed-off-by: Alexey Kardashevskiy 
> Signed-off-by: David Gibson 
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c |   18 --
>  arch/powerpc/platforms/powernv/pci.h  |6 ++
>  2 files changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
> b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 5e155df..4648475 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -877,6 +878,9 @@ static void __devinit pnv_ioda_setup_bus_dma(struct 
> pnv_ioda_pe *pe,
>   set_iommu_table_base(&dev->dev, &pe->tce32_table);
>   if (dev->subordinate)
>   pnv_ioda_setup_bus_dma(pe, dev->subordinate);
> +#ifdef CONFIG_DEVICE_ISOLATION
> + device_isolation_dev_add(&pe->di_group, &dev->dev);
> +#endif
>   }
>  }
>  
> @@ -957,11 +961,21 @@ static void __devinit pnv_pci_ioda_setup_dma_pe(struct 
> pnv_phb *phb,
>   }
>   iommu_init_table(tbl, phb->hose->node);
>  
> - if (pe->pdev)
> +#ifdef CONFIG_DEVICE_ISOLATION
> + BUG_ON(device_isolation_group_init(&pe->di_group, "ioda:rid%x-pe%x",
> +pe->rid, pe->pe_number) < 0);
> +#endif
> +
> + if (pe->pdev) {
>   set_iommu_table_base(&pe->pdev->dev, tbl);
> - else
> +#ifdef CONFIG_DEVICE_ISOLATION
> + device_isolation_dev_add(&pe->di_group, &pe->pdev->dev);
> +#endif
> + } else
>   pnv_ioda_setup_bus_dma(pe, pe->pbus);
>  

Blech, #ifdefs.

> +
> +
>   return;
>   fail:
>   /* XXX Failure: Try to fallback to 64-bit only ? */
> diff --git a/arch/powerpc/platforms/powernv/pci.h 
> b/arch/powerpc/platforms/powernv/pci.h
> index 64ede1e..3e282b7 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -1,6 +1,8 @@
>  #ifndef __POWERNV_PCI_H
>  #define __POWERNV_PCI_H
>  
> +#include 
> +
>  struct pci_dn;
>  
>  enum pnv_phb_type {
> @@ -60,6 +62,10 @@ struct pnv_ioda_pe {
>  
>   /* Link in list of PE#s */
>   struct list_head link;
> +
> +#ifdef CONFIG_DEVICE_ISOLATION
> + struct device_isolation_group di_group;
> +#endif

Embedding the struct means we need to know the size, which means we
can't get rid of the #ifdef.  Probably better to use a pointer if we
don't mind adding a few bytes in the #ifndef case.  Thanks,

Alex

>  };
>  
>  struct pnv_phb {





Re: [PATCH 2/3] device_isolation: Support isolation on POWER p5ioc2 bridges

2012-02-01 Thread Alex Williamson
On Wed, 2012-02-01 at 11:58 -0700, Alex Williamson wrote:
> On Wed, 2012-02-01 at 15:46 +1100, David Gibson wrote:
> > This patch adds code to the code for the powernv platform to create
> > and populate isolation groups on hardware using the p5ioc2 PCI host
> > bridge used on some IBM POWER systems.
> > 
> > Signed-off-by: Alexey Kardashevskiy 
> > Signed-off-by: David Gibson 
> > ---
> >  arch/powerpc/platforms/powernv/pci-p5ioc2.c |   14 +-
> >  arch/powerpc/platforms/powernv/pci.h|3 +++
> >  2 files changed, 16 insertions(+), 1 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pci-p5ioc2.c 
> > b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
> > index 2649677..e5bb3a6 100644
> > --- a/arch/powerpc/platforms/powernv/pci-p5ioc2.c
> > +++ b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
> > @@ -20,6 +20,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  
> >  #include 
> >  #include 
> > @@ -88,10 +89,21 @@ static void pnv_pci_init_p5ioc2_msis(struct pnv_phb 
> > *phb) { }
> >  static void __devinit pnv_pci_p5ioc2_dma_dev_setup(struct pnv_phb *phb,
> >struct pci_dev *pdev)
> >  {
> > -   if (phb->p5ioc2.iommu_table.it_map == NULL)
> > +   if (phb->p5ioc2.iommu_table.it_map == NULL) {
> > iommu_init_table(&phb->p5ioc2.iommu_table, phb->hose->node);
> > +#ifdef CONFIG_DEVICE_ISOLATION
> > +   phb->p5ioc2.di_group = kzalloc(sizeof(*(phb->p5ioc2.di_group)),
> > +  GFP_KERNEL);
> > +   BUG_ON(!phb->p5ioc2.di_group ||
> > +  (device_isolation_group_init(phb->p5ioc2.di_group,
> > +   "p5ioc2:%llx", 
> > phb->opal_id) < 0));
> > +#endif
> 
> Hmm, it's really unfortunate that this is architected so we need to
> surround everything in #ifdefs even though we have stub functions
> defined.

I think maybe we want:

#ifdef CONFIG_DEVICE_ISOLATION
struct device_isolation_group *device_isolation_create_group(void)
{
struct device_isolation_group *di_group;

di_group = kzalloc(sizeof(*di_group), GFP_KERNEL);
if (!di_group)
return ERR_PTR(-ENOMEM);

return di_group;
}
#else
struct device_isolation_group *device_isolation_create_group(void)
{
return NULL;
}
#endif

Then we can do:

phb->p5ioc2.di_group = device_isolation_create_group();
BUG_ON(IS_ERR(phb->p5ioc2.di_group) || 
(device_isolation_group_init(phb->p5ioc2.di_group, ...

(We pass NULL to the stubs, but that's ok)
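
To spell out why a NULL return from the stub does not trip the IS_ERR()
check, here is a userspace sketch of the error-pointer convention. The
demo_* helpers are hypothetical re-implementations written for this
example; they only imitate the behaviour of the kernel's ERR_PTR(),
PTR_ERR() and IS_ERR(), and the two bool parameters stand in for the
config option and an allocation failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

struct demo_group {
    int ndevices;
};

/* Encode a negative errno in the topmost addresses, which are never
 * valid object pointers. */
static inline void *demo_err_ptr(long err)
{
    assert(err < 0 && err >= -DEMO_MAX_ERRNO);
    return (void *)(intptr_t)err;
}

static inline long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True only for pointers in the error range; NULL is not an error. */
static inline bool demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* 'enabled' models the config option, 'simulate_oom' a failed kzalloc. */
static struct demo_group *demo_create_group(bool enabled, bool simulate_oom)
{
    struct demo_group *g;

    if (!enabled)
        return NULL;                    /* stub variant */
    if (simulate_oom)
        return demo_err_ptr(-ENOMEM);
    g = calloc(1, sizeof(*g));
    return g ? g : demo_err_ptr(-ENOMEM);
}
```

With the option disabled the function returns NULL, demo_is_err() is
false, and the caller simply passes NULL on to the other stubs.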

> > +   }
> >  
> > set_iommu_table_base(&pdev->dev, &phb->p5ioc2.iommu_table);
> > +#ifdef CONFIG_DEVICE_ISOLATION
> > +   device_isolation_dev_add(phb->p5ioc2.di_group, &pdev->dev);
> > +#endif
> >  }
> >  
> >  static void __init pnv_pci_init_p5ioc2_phb(struct device_node *np,
> > diff --git a/arch/powerpc/platforms/powernv/pci.h 
> > b/arch/powerpc/platforms/powernv/pci.h
> > index 8bc4796..64ede1e 100644
> > --- a/arch/powerpc/platforms/powernv/pci.h
> > +++ b/arch/powerpc/platforms/powernv/pci.h
> > @@ -87,6 +87,9 @@ struct pnv_phb {
> > union {
> > struct {
> > struct iommu_table iommu_table;
> > +#ifdef CONFIG_DEVICE_ISOLATION
> > +   struct device_isolation_group *di_group;
> > +#endif
> > } p5ioc2;
> >  
> > struct {
> 
> 





Re: [PATCH 2/3] device_isolation: Support isolation on POWER p5ioc2 bridges

2012-02-01 Thread Alex Williamson
On Wed, 2012-02-01 at 15:46 +1100, David Gibson wrote:
> This patch adds code to the code for the powernv platform to create
> and populate isolation groups on hardware using the p5ioc2 PCI host
> bridge used on some IBM POWER systems.
> 
> Signed-off-by: Alexey Kardashevskiy 
> Signed-off-by: David Gibson 
> ---
>  arch/powerpc/platforms/powernv/pci-p5ioc2.c |   14 +-
>  arch/powerpc/platforms/powernv/pci.h|3 +++
>  2 files changed, 16 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-p5ioc2.c 
> b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
> index 2649677..e5bb3a6 100644
> --- a/arch/powerpc/platforms/powernv/pci-p5ioc2.c
> +++ b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
> @@ -20,6 +20,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -88,10 +89,21 @@ static void pnv_pci_init_p5ioc2_msis(struct pnv_phb *phb) 
> { }
>  static void __devinit pnv_pci_p5ioc2_dma_dev_setup(struct pnv_phb *phb,
>  struct pci_dev *pdev)
>  {
> - if (phb->p5ioc2.iommu_table.it_map == NULL)
> + if (phb->p5ioc2.iommu_table.it_map == NULL) {
>   iommu_init_table(&phb->p5ioc2.iommu_table, phb->hose->node);
> +#ifdef CONFIG_DEVICE_ISOLATION
> + phb->p5ioc2.di_group = kzalloc(sizeof(*(phb->p5ioc2.di_group)),
> +GFP_KERNEL);
> + BUG_ON(!phb->p5ioc2.di_group ||
> +(device_isolation_group_init(phb->p5ioc2.di_group,
> + "p5ioc2:%llx", 
> phb->opal_id) < 0));
> +#endif

Hmm, it's really unfortunate that this is architected so we need to
surround everything in #ifdefs even though we have stub functions
defined.

> + }
>  
>   set_iommu_table_base(&pdev->dev, &phb->p5ioc2.iommu_table);
> +#ifdef CONFIG_DEVICE_ISOLATION
> + device_isolation_dev_add(phb->p5ioc2.di_group, &pdev->dev);
> +#endif
>  }
>  
>  static void __init pnv_pci_init_p5ioc2_phb(struct device_node *np,
> diff --git a/arch/powerpc/platforms/powernv/pci.h 
> b/arch/powerpc/platforms/powernv/pci.h
> index 8bc4796..64ede1e 100644
> --- a/arch/powerpc/platforms/powernv/pci.h
> +++ b/arch/powerpc/platforms/powernv/pci.h
> @@ -87,6 +87,9 @@ struct pnv_phb {
>   union {
>   struct {
>   struct iommu_table iommu_table;
> +#ifdef CONFIG_DEVICE_ISOLATION
> + struct device_isolation_group *di_group;
> +#endif
>   } p5ioc2;
>  
>   struct {





[PATCH 2/3] device_isolation: Support isolation on POWER p5ioc2 bridges

2012-01-31 Thread David Gibson
This patch adds code to the powernv platform to create
and populate isolation groups on hardware using the p5ioc2 PCI host
bridge used on some IBM POWER systems.

Signed-off-by: Alexey Kardashevskiy 
Signed-off-by: David Gibson 
---
 arch/powerpc/platforms/powernv/pci-p5ioc2.c |   14 +-
 arch/powerpc/platforms/powernv/pci.h|3 +++
 2 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-p5ioc2.c 
b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
index 2649677..e5bb3a6 100644
--- a/arch/powerpc/platforms/powernv/pci-p5ioc2.c
+++ b/arch/powerpc/platforms/powernv/pci-p5ioc2.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -88,10 +89,21 @@ static void pnv_pci_init_p5ioc2_msis(struct pnv_phb *phb) { 
}
 static void __devinit pnv_pci_p5ioc2_dma_dev_setup(struct pnv_phb *phb,
   struct pci_dev *pdev)
 {
-   if (phb->p5ioc2.iommu_table.it_map == NULL)
+   if (phb->p5ioc2.iommu_table.it_map == NULL) {
iommu_init_table(&phb->p5ioc2.iommu_table, phb->hose->node);
+#ifdef CONFIG_DEVICE_ISOLATION
+   phb->p5ioc2.di_group = kzalloc(sizeof(*(phb->p5ioc2.di_group)),
+  GFP_KERNEL);
+   BUG_ON(!phb->p5ioc2.di_group ||
+  (device_isolation_group_init(phb->p5ioc2.di_group,
+   "p5ioc2:%llx", 
phb->opal_id) < 0));
+#endif
+   }
 
set_iommu_table_base(&pdev->dev, &phb->p5ioc2.iommu_table);
+#ifdef CONFIG_DEVICE_ISOLATION
+   device_isolation_dev_add(phb->p5ioc2.di_group, &pdev->dev);
+#endif
 }
 
 static void __init pnv_pci_init_p5ioc2_phb(struct device_node *np,
diff --git a/arch/powerpc/platforms/powernv/pci.h 
b/arch/powerpc/platforms/powernv/pci.h
index 8bc4796..64ede1e 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -87,6 +87,9 @@ struct pnv_phb {
union {
struct {
struct iommu_table iommu_table;
+#ifdef CONFIG_DEVICE_ISOLATION
+   struct device_isolation_group *di_group;
+#endif
} p5ioc2;
 
struct {
-- 
1.7.8.3



[PATCH 3/3] device_isolation: Support isolation on POWER p7ioc (IODA) bridges

2012-01-31 Thread David Gibson
This patch adds code to the powernv platform to create
and populate isolation groups on hardware using the p7ioc (aka IODA) PCI host
bridge used on some IBM POWER systems.

Signed-off-by: Alexey Kardashevskiy 
Signed-off-by: David Gibson 
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   18 --
 arch/powerpc/platforms/powernv/pci.h  |6 ++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 5e155df..4648475 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -877,6 +878,9 @@ static void __devinit pnv_ioda_setup_bus_dma(struct 
pnv_ioda_pe *pe,
set_iommu_table_base(&dev->dev, &pe->tce32_table);
if (dev->subordinate)
pnv_ioda_setup_bus_dma(pe, dev->subordinate);
+#ifdef CONFIG_DEVICE_ISOLATION
+   device_isolation_dev_add(&pe->di_group, &dev->dev);
+#endif
}
 }
 
@@ -957,11 +961,21 @@ static void __devinit pnv_pci_ioda_setup_dma_pe(struct 
pnv_phb *phb,
}
iommu_init_table(tbl, phb->hose->node);
 
-   if (pe->pdev)
+#ifdef CONFIG_DEVICE_ISOLATION
+   BUG_ON(device_isolation_group_init(&pe->di_group, "ioda:rid%x-pe%x",
+  pe->rid, pe->pe_number) < 0);
+#endif
+
+   if (pe->pdev) {
set_iommu_table_base(&pe->pdev->dev, tbl);
-   else
+#ifdef CONFIG_DEVICE_ISOLATION
+   device_isolation_dev_add(&pe->di_group, &pe->pdev->dev);
+#endif
+   } else
pnv_ioda_setup_bus_dma(pe, pe->pbus);
 
+
+
return;
  fail:
/* XXX Failure: Try to fallback to 64-bit only ? */
diff --git a/arch/powerpc/platforms/powernv/pci.h 
b/arch/powerpc/platforms/powernv/pci.h
index 64ede1e..3e282b7 100644
--- a/arch/powerpc/platforms/powernv/pci.h
+++ b/arch/powerpc/platforms/powernv/pci.h
@@ -1,6 +1,8 @@
 #ifndef __POWERNV_PCI_H
 #define __POWERNV_PCI_H
 
+#include 
+
 struct pci_dn;
 
 enum pnv_phb_type {
@@ -60,6 +62,10 @@ struct pnv_ioda_pe {
 
/* Link in list of PE#s */
struct list_head link;
+
+#ifdef CONFIG_DEVICE_ISOLATION
+   struct device_isolation_group di_group;
+#endif
 };
 
 struct pnv_phb {
-- 
1.7.8.3



Re: Multiple TAP Interfaces, with multiple bridges

2010-02-03 Thread J L
On 3 February 2010 17:16,   wrote:
> On Wednesday 03 February 2010 17:56:43 J L wrote:
>> I am having an odd networking issue. It is one of those "it used to
>> work, and now it doesn't" kind of things. I can't work out what I am
>> doing differently.
>>
>> I have a virtual machine, started with (among other things):
>>   -net nic,macaddr=fa:9e:0b:53:d2:7d,model=rtl8139 -net
>> tap,script=/images/1/ifup-eth0,downscript=/images/1/ifdown-eth0
>>   -net nic,macaddr=fa:02:4e:86:ed:ce,model=e1000 -net
>> tap,script=/images/1/ifup-eth1,downscript=/images/1/ifdown-eth1
>
> This seems to be missing a vlan= option at least for the second pair:
>
>> What I actually get:
>>   * VM: eth1, dest MAC of Host's tap1/br0
>>   * Host: tap1, dest MAC of Host's tap1/br0
>>   * Host: br1, dest MAC of Host's tap1/br0
>>   * Host should, but does not route from br0 to br1
>>   * Host: tap0, dest MAC of ***Host's tap1/br0***
>>   * Host: br0, dest MAC of ***Host's tap1/br0***
>>   * Host: eth0, no packet
>>   * Server: eth0, no packet
>>
>> As you can see, the packet has egressed both tap interfaces! Is this
>> expected behaviour? What can I do about this?
>
> Qemu forwards this packet to everything inside of the same vlan, which
> is 0 by default. Does it work with this?
>
>   -net nic,vlan=1,macaddr=fa:9e:0b:53:d2:7d,model=rtl8139 -net 
> tap,vlan=1,script=/images/1/ifup-eth0,downscript=/images/1/ifdown-eth0
>   -net nic,vlan=2,macaddr=fa:02:4e:86:ed:ce,model=e1000 -net 
> tap,vlan=2,script=/images/1/ifup-eth1,downscript=/images/1/ifdown-eth1

Thanks, both to you and Tom, who both emailed this piece of clue at
the same time :)

My misunderstanding was in thinking that vlan=XX meant the packets
would land on the bridge with that VLAN tag. What it actually seems to
do is tie one-or-more '-net nic' sections to one-or-more '-net tap'
sections. That is, I thought the vlan=XX was host-wide, not guest-wide.

Don't know how it worked before - probably I just never noticed the
extra packets.
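
For anyone else who trips over this, a toy model of the behaviour
described above (each vlan= number acts like a hub inside one qemu
process) might look like the sketch below. struct endpoint and
hub_send() are hypothetical names made up for this illustration; this
is not qemu code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one "-net nic" or "-net tap" endpoint. */
struct endpoint {
    const char *name;
    int vlan;        /* qemu-internal hub number; 0 when vlan= is omitted */
    int rx_packets;  /* packets delivered to this endpoint */
};

/* Deliver one packet from eps[src] to every other endpoint that shares
 * its vlan number. Returns the number of endpoints that received it. */
static int hub_send(struct endpoint *eps, size_t n, size_t src)
{
    size_t i;
    int delivered = 0;

    assert(eps != NULL && src < n);

    for (i = 0; i < n; i++) {
        if (i == src || eps[i].vlan != eps[src].vlan)
            continue;
        eps[i].rx_packets++;
        delivered++;
    }
    return delivered;
}
```

With all four endpoints left on vlan 0, a packet sent by the second nic
reaches both taps (and the other nic). With vlan=1 on the first pair and
vlan=2 on the second, it reaches only its own tap.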


>> If I remove tap0 from the bridge, I then get:
>>   * VM: eth1, dest MAC of Host's tap1/br0
>>   * Host: tap1, dest MAC of Host's tap1/br0
>>   * Host: br1, dest MAC of Host's tap1/br0
>>   * Host should, but does not, route from br0 to br1
>>   * Host: tap0, no packet
>>   * Host: br0, no packet
>>   * Host: eth0, no packet
>>   * Server: eth0, no packet
>>
>> This is the other half of my problem: in this case, with effectively
>> only one tap, the host is not routing between br1 and br0. The packet
>> just gets silently dropped. Does anyone know what I am doing wrong?
>
> Maybe /proc/sys/net/ipv4/ip_forward is disabled?
Sorry, forgot to mention that bit. It is '1'.

I added a '-j LOG' rule to the FORWARD chain (as the only rule, policy
ACCEPT), and can see that the packets from the VM never make it to the
FORWARD chain.


>
>        Arnd
>


Thanks,
-- 
Jarrod Lowe


Re: Multiple TAP Interfaces, with multiple bridges

2010-02-03 Thread arnd
On Wednesday 03 February 2010 17:56:43 J L wrote:
> I am having an odd networking issue. It is one of those "it used to
> work, and now it doesn't" kind of things. I can't work out what I am
> doing differently.
> 
> I have a virtual machine, started with (among other things):
>   -net nic,macaddr=fa:9e:0b:53:d2:7d,model=rtl8139 -net
> tap,script=/images/1/ifup-eth0,downscript=/images/1/ifdown-eth0
>   -net nic,macaddr=fa:02:4e:86:ed:ce,model=e1000 -net
> tap,script=/images/1/ifup-eth1,downscript=/images/1/ifdown-eth1

This seems to be missing a vlan= option at least for the second pair:

> What I actually get:
>   * VM: eth1, dest MAC of Host's tap1/br0
>   * Host: tap1, dest MAC of Host's tap1/br0
>   * Host: br1, dest MAC of Host's tap1/br0
>   * Host should, but does not route from br0 to br1
>   * Host: tap0, dest MAC of ***Host's tap1/br0***
>   * Host: br0, dest MAC of ***Host's tap1/br0***
>   * Host: eth0, no packet
>   * Server: eth0, no packet
> 
> As you can see, the packet has egressed both tap interfaces! Is this
> expected behaviour? What can I do about this?

Qemu forwards this packet to everything inside of the same vlan, which
is 0 by default. Does it work with this?

   -net nic,vlan=1,macaddr=fa:9e:0b:53:d2:7d,model=rtl8139 -net 
tap,vlan=1,script=/images/1/ifup-eth0,downscript=/images/1/ifdown-eth0
   -net nic,vlan=2,macaddr=fa:02:4e:86:ed:ce,model=e1000 -net 
tap,vlan=2,script=/images/1/ifup-eth1,downscript=/images/1/ifdown-eth1

> If I remove tap0 from the bridge, I then get:
>   * VM: eth1, dest MAC of Host's tap1/br0
>   * Host: tap1, dest MAC of Host's tap1/br0
>   * Host: br1, dest MAC of Host's tap1/br0
>   * Host should, but does not, route from br0 to br1
>   * Host: tap0, no packet
>   * Host: br0, no packet
>   * Host: eth0, no packet
>   * Server: eth0, no packet
> 
> This is the other half of my problem: in this case, with effectively
> only one tap, the host is not routing between br1 and br0. The packet
> just gets silently dropped. Does anyone know what I am doing wrong?

Maybe /proc/sys/net/ipv4/ip_forward is disabled?

Arnd


Re: Multiple TAP Interfaces, with multiple bridges

2010-02-03 Thread Tom Lendacky
On Wednesday 03 February 2010 10:56:43 am J L wrote:
> Hi,
> 
> I am having an odd networking issue. It is one of those "it used to
> work, and now it doesn't" kind of things. I can't work out what I am
> doing differently.
> 
> I have a virtual machine, started with (among other things):
>   -net nic,macaddr=fa:9e:0b:53:d2:7d,model=rtl8139 -net
> tap,script=/images/1/ifup-eth0,downscript=/images/1/ifdown-eth0
>   -net nic,macaddr=fa:02:4e:86:ed:ce,model=e1000 -net
> tap,script=/images/1/ifup-eth1,downscript=/images/1/ifdown-eth1
> 

I believe this has to do with the qemu vlan support. If you don't specify the 
vlan= option you end up with nics on the same vlan. You need to assign the two 
nics to separate vlans using vlan= on each net parameter, eg:


   -net nic,vlan=0,macaddr=fa:9e:0b:53:d2:7d,model=rtl8139 -net
 tap,vlan=0,script=/images/1/ifup-eth0,downscript=/images/1/ifdown-eth0
   -net nic,vlan=1,macaddr=fa:02:4e:86:ed:ce,model=e1000 -net
 tap,vlan=1,script=/images/1/ifup-eth1,downscript=/images/1/ifdown-eth1

Try that and see if you get the results you expect.

Tom

> The ifup-ethX script inserts the tap interface into the correct bridge
> (of which there are multiple.)
> 
> The Virtual Machine is Centos 5.3, with a 2.6.27.21 kernel. The Host
> is Ubuntu 9.10 with a 2.6.31 kernel.
> 
> 
> My network then looks like:
> 
> The Virtual Machine has an eth0 interface, which is matched with tap0
> on the host.
> The Virtual Machine has an eth1 interface, which is matched with tap1
> on the host.
> 
> The host has a bridge br0, which contains tap0 and eth0.
> The host has a bridge br1, which contains tap1.
> 
> There is a server on the same network as the Host's eth0.
> 
> The Virtual Machine's eth0 interface is down.
> The Virtual Machine's eth1 interface has an IP address of 192.168.1.10/24.
> The Virtual Machine has a default gateway of 192.168.1.1.
> 
> The host's br0 has an IP address of 192.168.0.1/24.
> The host's br1 has an IP address of 192.168.1.1/24.
> 
> The server has an IP address of 192.168.0.20/24, and a default gateway
> of 192.168.0.1.
> 
> Firewalling is disabled everywhere. I have allowed time for the
> bridges and STP to settle.
> 
> 
> 
> If I go to the Virtual Machine, and ping 192.168.0.20 (the server), I
> would expect tcpdumps to show:
>   * VM: eth1, dest MAC of Host's tap1/br0
>   * Host: tap1, dest MAC of Host's tap1/br0
>   * Host: br1, dest MAC of Host's tap1/br0
>   * Host now routes from br1 to br0
>   * Host: tap0, no packet
>   * Host: br0, dest MAC of Server
>   * Host: eth0, dest MAC of Server
>   * Server: eth0, dest MAC of Server
> 
> What I actually get:
>   * VM: eth1, dest MAC of Host's tap1/br0
>   * Host: tap1, dest MAC of Host's tap1/br0
>   * Host: br1, dest MAC of Host's tap1/br0
>   * Host should, but does not route from br0 to br1
>   * Host: tap0, dest MAC of ***Host's tap1/br0***
>   * Host: br0, dest MAC of ***Host's tap1/br0***
>   * Host: eth0, no packet
>   * Server: eth0, no packet
> 
> As you can see, the packet has egressed *both* tap interfaces! Is this
> expected behaviour? What can I do about this?
> 
> 
> 
> 
> If I remove tap0 from the bridge, I then get:
>   * VM: eth1, dest MAC of Host's tap1/br0
>   * Host: tap1, dest MAC of Host's tap1/br0
>   * Host: br1, dest MAC of Host's tap1/br0
>   * Host should, but does not, route from br0 to br1
>   * Host: tap0, no packet
>   * Host: br0, no packet
>   * Host: eth0, no packet
>   * Server: eth0, no packet
> 
> This is the other half of my problem: in this case, with effectively
> only one tap, the host is not routing between br1 and br0. The packet
> just gets silently dropped. Does anyone know what I am doing wrong?
> 
> I hope I have managed to explain this well enough!
> 
> Thanks,
> --
> Jarrod Lowe
> 


Multiple TAP Interfaces, with multiple bridges

2010-02-03 Thread J L
Hi,

I am having an odd networking issue. It is one of those "it used to
work, and now it doesn't" kind of things. I can't work out what I am
doing differently.

I have a virtual machine, started with (among other things):
  -net nic,macaddr=fa:9e:0b:53:d2:7d,model=rtl8139 -net
tap,script=/images/1/ifup-eth0,downscript=/images/1/ifdown-eth0
  -net nic,macaddr=fa:02:4e:86:ed:ce,model=e1000 -net
tap,script=/images/1/ifup-eth1,downscript=/images/1/ifdown-eth1

The ifup-ethX script inserts the tap interface into the correct bridge
(of which there are multiple.)

The Virtual Machine is CentOS 5.3, with a 2.6.27.21 kernel. The Host
is Ubuntu 9.10 with a 2.6.31 kernel.


My network then looks like:

The Virtual Machine has an eth0 interface, which is matched with tap0
on the host.
The Virtual Machine has an eth1 interface, which is matched with tap1
on the host.

The host has a bridge br0, which contains tap0 and eth0.
The host has a bridge br1, which contains tap1.

There is a server on the same network as the Host's eth0.

The Virtual Machine's eth0 interface is down.
The Virtual Machine's eth1 interface has an IP address of 192.168.1.10/24.
The Virtual Machine has a default gateway of 192.168.1.1.

The host's br0 has an IP address of 192.168.0.1/24.
The host's br1 has an IP address of 192.168.1.1/24.

The server has an IP address of 192.168.0.20/24, and a default gateway
of 192.168.0.1.

Firewalling is disabled everywhere. I have allowed time for the
bridges and STP to settle.



If I go to the Virtual Machine, and ping 192.168.0.20 (the server), I
would expect tcpdumps to show:
  * VM: eth1, dest MAC of Host's tap1/br1
  * Host: tap1, dest MAC of Host's tap1/br1
  * Host: br1, dest MAC of Host's tap1/br1
  * Host now routes from br1 to br0
  * Host: tap0, no packet
  * Host: br0, dest MAC of Server
  * Host: eth0, dest MAC of Server
  * Server: eth0, dest MAC of Server
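
The first hop in the expected path above is an ordinary on-link versus
gateway decision. A minimal sketch with Python's ipaddress module, using the
addresses from this message; the function name next_hop is hypothetical, and
the sketch only models which IP the sender resolves to a MAC address, not the
host's forwarding behaviour.

```python
import ipaddress
from typing import Optional

def next_hop(src_iface: str, dest: str, gateway: Optional[str] = None) -> Optional[str]:
    """Return the IP address the sender must resolve to a MAC address.

    src_iface is the sender's address in CIDR form, e.g. "192.168.1.10/24".
    A destination inside the sender's subnet is reached directly; anything
    else is handed to the default gateway (None if no gateway is set).
    """
    iface = ipaddress.ip_interface(src_iface)
    if ipaddress.ip_address(dest) in iface.network:
        return dest
    return gateway

# VM eth1 (192.168.1.10/24) pinging the server: off-subnet, so the frame
# is addressed to the default gateway 192.168.1.1, which is the host's br1.
vm_hop = next_hop("192.168.1.10/24", "192.168.0.20", "192.168.1.1")

# Host br0 (192.168.0.1/24) sending to the server: on-link, sent directly.
host_hop = next_hop("192.168.0.1/24", "192.168.0.20")
```

The second call only applies once the host has actually routed the packet
from br1 to br0, which is the step that fails in the traces below.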

What I actually get:
  * VM: eth1, dest MAC of Host's tap1/br1
  * Host: tap1, dest MAC of Host's tap1/br1
  * Host: br1, dest MAC of Host's tap1/br1
  * Host should, but does not, route from br1 to br0
  * Host: tap0, dest MAC of ***Host's tap1/br1***
  * Host: br0, dest MAC of ***Host's tap1/br1***
  * Host: eth0, no packet
  * Server: eth0, no packet

As you can see, the packet has egressed *both* tap interfaces! Is this
expected behaviour? What can I do about this?




If I remove tap0 from the bridge, I then get:
  * VM: eth1, dest MAC of Host's tap1/br1
  * Host: tap1, dest MAC of Host's tap1/br1
  * Host: br1, dest MAC of Host's tap1/br1
  * Host should, but does not, route from br1 to br0
  * Host: tap0, no packet
  * Host: br0, no packet
  * Host: eth0, no packet
  * Server: eth0, no packet

This is the other half of my problem: in this case, with effectively
only one tap, the host is not routing between br1 and br0. The packet
just gets silently dropped. Does anyone know what I am doing wrong?

I hope I have managed to explain this well enough!

Thanks,
--
Jarrod Lowe


RE: Do I set up separate bridges for each guest?

2009-10-20 Thread Thomas Besser
Neil Aggarwal wrote:
> Dor:
>> The simplest thing is to use a single bridge for all -
>> The physical nic should be part of it and supply the outside world
>> connection. The physical nic doesn't need an IP and the bridge should
>> own it. All vms can use this bridge.
> 
> I want to assign a static IP to each of the guests,
> how would I do that with a single bridge?

What's the problem? Define the static IP in your guests and it should work.

Regards
Thomas



RE: Do I set up separate bridges for each guest?

2009-10-20 Thread Neil Aggarwal
Dor:

> The simplest thing is to use a single bridge for all -
> The physical nic should be part of it and supply the outside world 
> connection. The physical nic doesn't need an IP and the bridge should 
> own it. All vms can use this bridge.

I want to assign a static IP to each of the guests,
how would I do that with a single bridge?

Thanks,
Neil

--
Neil Aggarwal, (281)846-8957, www.JAMMConsulting.com
Will your e-commerce site go offline if you have
a DB server failure, fiber cut, flood, fire, or other disaster?
If so, ask about our geographically redundant database system. 



Re: Do I set up separate bridges for each guest?

2009-10-20 Thread Dor Laor

On 10/20/2009 04:37 AM, Neil Aggarwal wrote:

> Hello:
>
> I am installing KVM on top of CentOS 5.4 so I can
> have two guests running on my host. I would like to
> have the host and guests accessible from my
> network.
>
> Do I set up separate bridges for each guest or would
> they somehow be shared?
>
> If I set up separate bridges, I think I need to do
> in /etc/sysconfig/network-scripts on the host machine:
>
> 1. Set up ifcfg-eth0 with the ip information of the
> host (For example 192.168.2.200)
> 2. Set up ifcfg-eth0:1 for the first guest.  It will
> have BRIDGE=br1
> 3. Create ifcfg-br1 with the IP info for the first
> guest (For example 192.168.2.201)
> 4. Set up ifcfg-eth0:2 for the second guest.  It will
> have BRIDGE=br2
> 5. Create ifcfg-br2 with the IP info for the second
> guest (For example 192.168.2.202)
>
> Is this correct or did I miss something?


The simplest thing is to use a single bridge for all -
The physical nic should be part of it and supply the outside world 
connection. The physical nic doesn't need an IP and the bridge should 
own it. All vms can use this bridge.


# cat /etc/sysconfig/network-scripts/ifcfg-br0
DEVICE=br0
TYPE=Bridge
ONBOOT=yes
GATEWAYDEV=''
BOOTPROTO=dhcp
DELAY=0
HWADDR=00:14:5E:17:D0:04
# cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=none
HWADDR=00:14:5E:17:D0:04
BRIDGE=br0
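
The two files above can also be generated rather than typed by hand. A
minimal sketch, assuming only the keys shown above; the helper names
render_ifcfg and single_bridge_configs are hypothetical, and the
machine-specific HWADDR and GATEWAYDEV lines are deliberately left out.
Nothing is written to disk.

```python
def render_ifcfg(settings: dict) -> str:
    """Render KEY=value lines in the ifcfg-* style shown above."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"

def single_bridge_configs(bridge: str, nic: str) -> dict:
    """Return {filename: contents} for one bridge that owns one physical NIC."""
    bridge_cfg = {
        "DEVICE": bridge,
        "TYPE": "Bridge",
        "ONBOOT": "yes",
        "BOOTPROTO": "dhcp",   # the bridge, not the NIC, carries the host IP
        "DELAY": "0",
    }
    nic_cfg = {
        "DEVICE": nic,
        "ONBOOT": "yes",
        "BOOTPROTO": "none",   # no IP on the NIC that is attached to the bridge
        "BRIDGE": bridge,
    }
    return {
        f"ifcfg-{bridge}": render_ifcfg(bridge_cfg),
        f"ifcfg-{nic}": render_ifcfg(nic_cfg),
    }

configs = single_bridge_configs("br0", "eth0")
for name, text in configs.items():
    print(f"# {name}")
    print(text)
```

Every guest then attaches its tap device to the same br0; no per-guest
bridge or eth0:N alias is needed on the host.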




> Thanks,
> Neil
>
>
> --
> Neil Aggarwal, (281)846-8957, www.JAMMConsulting.com
> Will your e-commerce site go offline if you have
> a DB server failure, fiber cut, flood, fire, or other disaster?
> If so, ask about our geographically redundant database system.



Do I set up separate bridges for each guest?

2009-10-19 Thread Neil Aggarwal
Hello:

I am installing KVM on top of CentOS 5.4 so I can
have two guests running on my host. I would like to 
have the host and guests accessible from my
network.

Do I set up separate bridges for each guest or would
they somehow be shared?

If I set up separate bridges, I think I need to do
in /etc/sysconfig/network-scripts on the host machine:

1. Set up ifcfg-eth0 with the ip information of the 
host (For example 192.168.2.200)
2. Set up ifcfg-eth0:1 for the first guest.  It will
have BRIDGE=br1
3. Create ifcfg-br1 with the IP info for the first
guest (For example 192.168.2.201)
4. Set up ifcfg-eth0:2 for the second guest.  It will
have BRIDGE=br2
5. Create ifcfg-br2 with the IP info for the second
guest (For example 192.168.2.202)

Is this correct or did I miss something?

Thanks,
Neil


--
Neil Aggarwal, (281)846-8957, www.JAMMConsulting.com
Will your e-commerce site go offline if you have
a DB server failure, fiber cut, flood, fire, or other disaster?
If so, ask about our geographically redundant database system.



Re: Bridges problem with kvm (SOLVED)

2009-09-17 Thread carlopmart

carlopmart wrote:

> Hi all,
>
>  I have a strange problem with my bridges configuration. I have a
> rhel5.4 host with kvm-83-105 package installed. This host has two
> bridged interfaces defined to use with kvm guests:
>
> DEVICE=prodif
> ONBOOT=yes
> TYPE=Bridge
> IPADDR=172.26.50.14
> NETMASK=255.255.255.240
> DELAY=0
> STP=off
>
> and
>
> DEVICE=iscsif
> ONBOOT=yes
> TYPE=Bridge
> DELAY=0
> STP=off
>
>  I have installed two kvm guests (rhel5.4 also) with two virtual
> interfaces using virtio driver on one guest and e1000 driver on the
> other guest.
>
>  My problem is: when I do a ping between these guests over prodif bridge
> all works as expected: ping responds. But if I do another ping over
> iscsif bridge doesn't works. The only difference between prodif bridge
> and iscsif bridge is that prodif has an IP address.
>
>  More info:
>
>  a) brctl show on host:
>
> bridge name     bridge id               STP enabled     interfaces
> iscsif          8000.c201b3289830       no              vnet3
>                                                         vnet1
> prodif          8000.226af089f4c3       no              vnet2
>                                                         vnet0
>
>  b) sysctl.conf on host:
>
> net.bridge.bridge-nf-call-ip6tables = 0
> net.bridge.bridge-nf-call-iptables = 0
> net.bridge.bridge-nf-call-arptables = 0
>
>  I don't have iptables rules defined and net.ipv4.ip_forward is disabled
> (but if I put sysctl -w net.ipv4.ip_forward=1, result is the same).
>
>  What am I doing wrong??
>
>  Many thanks.



I found the problem: the MAC addresses used on the kvm guests. I have
changed them to 00:XX:XX:XX:XX:XX.
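
The message does not say what was wrong with the original addresses. One
thing worth checking, purely as an assumption about a common pitfall and not
a diagnosis of this case, is whether a guest MAC has the multicast bit set
in its first octet. A minimal sketch; the function names and the sample
addresses are hypothetical.

```python
def parse_mac(mac: str) -> bytes:
    """Parse 'aa:bb:cc:dd:ee:ff' into 6 raw bytes; raise ValueError if malformed."""
    parts = mac.split(":")
    if len(parts) != 6 or any(len(p) != 2 for p in parts):
        raise ValueError(f"not a MAC address: {mac!r}")
    return bytes(int(p, 16) for p in parts)

def is_unicast(mac: str) -> bool:
    """True when the lowest bit of the first octet (the multicast bit) is 0."""
    return parse_mac(mac)[0] & 0x01 == 0

def is_locally_administered(mac: str) -> bool:
    """True when the second-lowest bit of the first octet is 1."""
    return parse_mac(mac)[0] & 0x02 != 0

# Hypothetical sample addresses, not taken from the original report.
good = "00:11:22:33:44:55"   # first octet 0x00: unicast
bad = "01:11:22:33:44:55"    # first octet 0x01: multicast bit set

print(good, "unicast" if is_unicast(good) else "multicast")
print(bad, "unicast" if is_unicast(bad) else "multicast")
```

An address that starts with 00, as in the fix above, always passes the
unicast check. Duplicate MACs across guests would be a separate problem that
this sketch does not detect.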


--
CL Martinez
carlopmart {at} gmail {d0t} com


Bridges problem with kvm

2009-09-17 Thread carlopmart

Hi all,

 I have a strange problem with my bridges configuration. I have a 
rhel5.4 host with kvm-83-105 package installed. This host has two 
bridged interfaces defined to use with kvm guests:


DEVICE=prodif
ONBOOT=yes
TYPE=Bridge
IPADDR=172.26.50.14
NETMASK=255.255.255.240
DELAY=0
STP=off

and

DEVICE=iscsif
ONBOOT=yes
TYPE=Bridge
DELAY=0
STP=off

 I have installed two kvm guests (rhel5.4 also) with two virtual 
interfaces using virtio driver on one guest and e1000 driver on the 
other guest.


 My problem is: when I ping between these guests over the prodif
bridge, everything works as expected and the ping is answered. But a ping
over the iscsif bridge does not work. The only difference between the
prodif bridge and the iscsif bridge is that prodif has an IP address.


 More info:

 a) brctl show on host:

bridge name     bridge id               STP enabled     interfaces
iscsif          8000.c201b3289830       no              vnet3
                                                        vnet1
prodif          8000.226af089f4c3       no              vnet2
                                                        vnet0

 b) sysctl.conf on host:

net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

 I don't have any iptables rules defined and net.ipv4.ip_forward is
disabled (but if I run sysctl -w net.ipv4.ip_forward=1, the result is the
same).


 What am I doing wrong??

 Many thanks.

--
CL Martinez
carlopmart {at} gmail {d0t} com


Re: bridges

2009-05-07 Thread Matthew Palmer
On Thu, May 07, 2009 at 08:57:03AM -0700, Ross Boylan wrote:
> I'm trying to understand bridging with KVM, but am still puzzled.
> I think that the recommended bridging with TAP means that packets from
> the VM will end up going out the host card attached to the default
> gateway.  But it looks to me as if their IP address is unchanged, which
> means replies will never reach me.  Is that correct?  Do I need to NAT
> the packets, or is something already doing that?
> 
> Some documents indicate that I need to bring the interfaces (e.g., eth0)
> down before I bring the bridge up, and that afterwards only the bridge
> will have an IP address.  Is that right?

Here's how I think of a Linux "soft" bridge: the bridge consists of an
Ethernet switch, and a regular interface (named after the bridge) that is
connected to that switch.  This is why you "give an IP address to the
bridge", because "the bridge" is also a NIC of its own.

If you attach any physical interfaces (eg ethN) to the bridge, they aren't
NICs any more, they're just network cables you plug into the switch to pass
traffic to other switches.  Attaching VMs to the switch is just hooking up
more cables between the switch and the VMs.
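
The switch half of that picture can be modelled in a few lines. This is a
toy sketch of a learning switch, not the kernel's bridge code; the class
name SoftBridge, the port names, and the placeholder MAC strings are all
hypothetical. The port called "br0" stands for the bridge's own interface,
that is, the host's NIC plugged into the switch.

```python
class SoftBridge:
    """Toy model: an Ethernet switch that learns which MAC sits behind which port."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # source MAC -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        """Return the set of ports a frame is sent out of."""
        if in_port not in self.ports:
            raise ValueError(f"unknown port: {in_port}")
        self.mac_table[src_mac] = in_port      # learn where the sender lives
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            return self.ports - {in_port}      # unknown destination: flood
        if out_port == in_port:
            return set()                       # destination is behind the same port
        return {out_port}

bridge = SoftBridge(["br0", "eth0", "tap0", "tap1"])
first = bridge.forward("tap0", "mac-vm0", "mac-vm1")  # vm1 not learned yet
reply = bridge.forward("tap1", "mac-vm1", "mac-vm0")  # vm0 was learned on tap0
print(sorted(first), sorted(reply))
```

Note that nothing here looks at IP addresses: eth0 and the taps are just
ports, which is why only the "br0" port needs an address.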

If you want your host to do NAT for your VMs, then you do as you would for
any other firewall -- you have one switch (the bridge, in this case) with
all of your VMs and the "internal" interface of the host (in this case, the
bridge as well) all plugged in, and then a second interface to the outside
world (the physical NIC).

> Some documents, e.g.,
> http://ebtables.sourceforge.net/br_fw_ia/br_fw_ia.html, indicate
> iptables should "just work" with bridging.

Yes, iptables *does* "just work" with bridging, in the sense that iptables
can still filter IP packets passing through its interfaces.  What you
*can't* do, though, is have some sort of magic iptables filter deep in the
bridge that plays with all traffic as it traverses.  For that, there's
ebtables, which is iptables but for Ethernet (rather than IP) traffic. 
Personally, I've never used ebtables in my life.

> However, I've seen someone
> with a 2.6.15 kernel ask about firewalling and be told they needed to
> patch the kernel to get it work (don't have the reference handy).
> Should it just work?

It should Just Work, and if you've got to patch any 2.6 (or even probably
2.4) kernel then you're doing something *very* esoteric.

- Matt


Re: bridges

2009-05-07 Thread Cam Macdonell

Ross Boylan wrote:
> On Thu, 2009-05-07 at 11:13 -0600, Cam Macdonell wrote:
> > Ross Boylan wrote:
> > > I'm trying to understand bridging with KVM, but am still puzzled.
> > > I think that the recommended bridging with TAP means that packets from
> > > the VM will end up going out the host card attached to the default
> > > gateway.  But it looks to me as if their IP address is unchanged, which
> > > means replies will never reach me.  Is that correct?  Do I need to NAT
> > > the packets, or is something already doing that?
> >
> > Hi Ross,
> >
> > This is the place to start
> >
> > http://www.linux-kvm.org/page/Networking.
>
> I saw that; it gives some recipes but I wasn't sure what their effect
> was.
>
> > You want a public bridge.
> >
> > I'm not sure what "their" and "me" mean in your email.  In short, with
> > bridging each VM has its own IP and that VM can be accessed directly
> > from the network.
>
> "their" = the VM.
> "me" = my host machine.
>
> So if the VM's are running on their own subnet,

VMs do not run on their own subnet with bridged networking.

> e.g., 10.0.2.* (I've
> been assuming the subnet with TAP is like the one with the User Mode
> Network stack in 3.7.3 of http://www.nongnu.org/qemu/qemu-doc.html) and
> my host machine is on another net, e.g., 10.0.8.* then I think the
> packet will go out with an IP of 10.0.2.2 (say).  When some other
> machine tries to reply to 10.0.2.2, the packet gets lost because the
> outside network thinks 10.0.2.* is not for it.  At least that's my
> concern.  If the return IP address on the packet were 10.0.8.44
> (supposing that's the IP of my host machine) then the packets could find
> their way back.

Using bridged networking is very different from the user stack.  The
user stack is extremely limited and slow.

> My host machine is on an internal network with a 10.* IP.  The example
> might be clearer if one supposed that the VM's were on a 192.168.*
> network.
>
> I am perhaps being influenced by the fact that I don't want to ask for
> more IP's, so I don't want to configure the VM's to use an IP on our
> 10.0.8 network.

Then you probably want to use a NAT network.  A NAT setup puts all the
VMs on their own network within the host machine.  iptables is necessary
to forward the subnet packets out to the world and back.

Here is some older documentation, but not much has changed.  Look at the
first entry under "Advanced Networking".

https://help.ubuntu.com/community/KVMFeisty

> Does the TAP networking setup a whole subnet like the user mode network
> stack (e.g., running a DHCP server), or is the idea that I would just
> give the VM an IP on my subnet (10.0.8.*) in this example?

No, bridged networking uses taps (one tap per VM) and effectively puts
all the VMs on the same network your host is on.  You would need to get
an IP from your sysadmin for each VM.

> If the latter is the case (I'm now suspecting it is) then I think the
> solution is clear.  I just stick the VM's on a private (to my machine)
> subnet, like 192.168.*, and I do NAT on the packets as they go out.


NAT is a very common solution.  Use VDE (vde.sourceforge.net) to create
a virtual switch on your host for the VMs.  dnsmasq can serve dynamic 
IPs to the VMs on their own subnet that doesn't bother your sysadmin at 
all.  Use iptables to forward and receive packets through your host's 
NIC.
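
As a rough illustration of the iptables part, here is a minimal sketch that
only builds the commands and does not run them. The subnet 192.168.100.0/24
and the outgoing interface eth0 are assumptions chosen for the example, and
the helper name nat_commands is hypothetical; adapt all three to your setup.

```python
import ipaddress

def nat_commands(vm_subnet: str, out_iface: str) -> list:
    """Build (but do not run) commands that masquerade a private VM subnet."""
    subnet = str(ipaddress.ip_network(vm_subnet))  # raises ValueError on a bad CIDR
    return [
        # Let the host route packets between interfaces.
        ["sysctl", "-w", "net.ipv4.ip_forward=1"],
        # Rewrite the source address of outgoing VM traffic to the host's address.
        ["iptables", "-t", "nat", "-A", "POSTROUTING",
         "-s", subnet, "-o", out_iface, "-j", "MASQUERADE"],
        # Allow VM traffic out, and replies to existing connections back in.
        ["iptables", "-A", "FORWARD", "-s", subnet, "-o", out_iface, "-j", "ACCEPT"],
        ["iptables", "-A", "FORWARD", "-d", subnet, "-i", out_iface,
         "-m", "state", "--state", "RELATED,ESTABLISHED", "-j", "ACCEPT"],
    ]

cmds = nat_commands("192.168.100.0/24", "eth0")
for cmd in cmds:
    print(" ".join(cmd))
```

Printing the commands first makes it easy to review them before running
them as root; the VDE switch and dnsmasq setup are not covered by this
sketch.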


Cam


Re: bridges

2009-05-07 Thread Ross Boylan
On Thu, 2009-05-07 at 11:13 -0600, Cam Macdonell wrote:
> 
> Ross Boylan wrote:
> > I'm trying to understand bridging with KVM, but am still puzzled.
> > I think that the recommended bridging with TAP means that packets
> from
> > the VM will end up going out the host card attached to the default
> > gateway.  But it looks to me as if their IP address is unchanged,
> which
> > means replies will never reach me.  Is that correct?  Do I need to
> NAT
> > the packets, or is something already doing that?
> 
> Hi Ross,
> 
> This is the place to start
> 
> http://www.linux-kvm.org/page/Networking.  
I saw that; it gives some recipes but I wasn't sure what their effect
was.

> You want a public bridge.
> 
> I'm not sure what "their" and "me" mean in your email.  In short,
> with 
> bridging each VM has its own IP and that VM can be accessed directly 
> from the network.
"their" = the VM.
"me" = my host machine.

So if the VM's are running on their own subnet, e.g., 10.0.2.* (I've
been assuming the subnet with TAP is like the one with the User Mode
Network stack in 3.7.3 of http://www.nongnu.org/qemu/qemu-doc.html) and
my host machine is on another net, e.g., 10.0.8.* then I think the
packet will go out with an IP of 10.0.2.2 (say).  When some other
machine tries to reply to 10.0.2.2, the packet gets lost because the
outside network thinks 10.0.2.* is not for it.  At least that's my
concern.  If the return IP address on the packet were 10.0.8.44
(supposing that's the IP of my host machine) then the packets could find
their way back.

My host machine is on an internal network with a 10.* IP.  The example
might be clearer if one supposed that the VM's were on a 192.168.*
network.

I am perhaps being influenced by the fact that I don't want to ask for
more IP's, so I don't want to configure the VM's to use an IP on our
10.0.8 network.

Does the TAP networking set up a whole subnet like the user mode network
stack (e.g., running a DHCP server), or is the idea that I would just
give the VM an IP on my subnet (10.0.8.*) in this example?

If the latter is the case (I'm now suspecting it is) then I think the
solution is clear.  I just stick the VM's on a private (to my machine)
subnet, like 192.168.*, and I do NAT on the packets as they go out.



bridges

2009-05-07 Thread Ross Boylan
I'm trying to understand bridging with KVM, but am still puzzled.
I think that the recommended bridging with TAP means that packets from
the VM will end up going out the host card attached to the default
gateway.  But it looks to me as if their IP address is unchanged, which
means replies will never reach me.  Is that correct?  Do I need to NAT
the packets, or is something already doing that?

Some documents indicate that I need to bring the interfaces (e.g., eth0)
down before I bring the bridge up, and that afterwards only the bridge
will have an IP address.  Is that right?

Some documents, e.g.,
http://ebtables.sourceforge.net/br_fw_ia/br_fw_ia.html, indicate
iptables should "just work" with bridging.  However, I've seen someone
with a 2.6.15 kernel ask about firewalling and be told they needed to
patch the kernel to get it to work (don't have the reference handy).
Should it just work?

I'm running a 2.6.29 kernel on Debian Lenny with kvm 72+dfsg-5~lenny1.
Version 84+dfsg-2 is available in experimental.  Is there much to be
gained by going with the more recent version?

Please cc me; I'm not on the list.

Thanks.
Ross Boylan

