Re: 3.6-rc7 boot crash + bisection

2012-09-28 Thread Roedel, Joerg
On Wed, Sep 26, 2012 at 04:04:03PM -0600, Alex Williamson wrote:

> Here's a lockdep clean version of it:
> 
> amd_iommu: Handle aliases not backed by devices
> 
> Aliases sometimes don't have a struct pci_dev backing them.  This breaks
> our attempt to figure out the topology and device quirks that may affect
> IOMMU grouping.  When this happens, allocate an IOMMU group on the
> dev_data for the alias and make use of it for all devices referencing
> this alias.

Yes, this is the real fix. But it is too big for v3.6 at this time, so
I would take this for 3.7 and use my small fix for 3.6.

> Signed-off-by: Alex Williamson 
> ---
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index b64502d..4eacb17 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -71,6 +71,7 @@ static DEFINE_SPINLOCK(iommu_pd_list_lock);
>  /* List of all available dev_data structures */
>  static LIST_HEAD(dev_data_list);
>  static DEFINE_SPINLOCK(dev_data_list_lock);
> +static DEFINE_MUTEX(dev_data_iommu_group_lock);

I think this lock is not necessary. The iommu_init_device routine does
not run multiple times in parallel for the same device. So we should be
safe on that side.


Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 3.6-rc7 boot crash + bisection

2012-09-27 Thread Florian Dazinger
On Wed, 26 Sep 2012 16:04:03 -0600, Alex Williamson wrote:

> On Wed, 2012-09-26 at 13:50 -0600, Alex Williamson wrote:
> > On Wed, 2012-09-26 at 10:21 -0600, Alex Williamson wrote:
> > > On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> > > > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > > > > Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> > > > > alias interface to iommu groups.  Seems like this could just be an extra
> > > > > parameter to iommu_group_get and iommu_group_add_device (empty in the
> > > > > typical case).  Then we have the problem of what's the type for an
> > > > > alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> > > > > that.  Maybe iommu groups should just treat it as a void* so iommus can
> > > > > use a pointer to some structure or a fixed value like a u16 bus:slot.
> > > > > Thoughts?
> > > > 
> > > > Good question. The iommu-groups are part of the IOMMU-API, with an
> > > > interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> > > > alias handling itself should be a function of the interface to the IOMMU
> > > > driver. In general the interface should not be bus specific.
> > > > 
> > > > So a void pointer seems the only logical choice then. But I would not
> > > > limit its scope to alias handling. How about making it a bus-private
> > > > pointer where IOMMU drivers store bus-specific information. That way we
> > > > make sure that there is one struct per bus-type for this pointer, and
> > > > not one structure per IOMMU driver.
> > > 
> > > I thought of another approach that may actually be more 3.6 worthy.
> > > What if we just make the iommu driver handle it?  For instance,
> > > amd_iommu can walk the alias table looking for entries that use the same
> > > alias and get the device via pci_get_bus_and_slot.  If it finds a device
> > > with an iommu group, it attaches the new device to the same group,
> > > hiding anything about aliases from the group layer.  It just groups all
> > > devices within the range.  I think the only complication is making sure
> > > we're safe around device hotplug while we're doing this.  Thanks,
> > 
> > I think this could work.  Instead of searching for other devices, check
> > for or allocate an iommu group on the alias dev_data, any "virtual"
> > aliases use that iommu group.  Florian, could you test this as well?
> 
> Here's a lockdep clean version of it:
> 
> amd_iommu: Handle aliases not backed by devices
> 
[ skipped patch ]

Yes, this patch is working for me, too. I also tested your second patch, and it
worked as well.
Thanks, Florian


Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 13:50 -0600, Alex Williamson wrote:
> On Wed, 2012-09-26 at 10:21 -0600, Alex Williamson wrote:
> > On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> > > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > > > Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> > > > alias interface to iommu groups.  Seems like this could just be an extra
> > > > parameter to iommu_group_get and iommu_group_add_device (empty in the
> > > > typical case).  Then we have the problem of what's the type for an
> > > > alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> > > > that.  Maybe iommu groups should just treat it as a void* so iommus can
> > > > use a pointer to some structure or a fixed value like a u16 bus:slot.
> > > > Thoughts?
> > > 
> > > Good question. The iommu-groups are part of the IOMMU-API, with an
> > > interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> > > alias handling itself should be a function of the interface to the IOMMU
> > > driver. In general the interface should not be bus specific.
> > > 
> > > So a void pointer seems the only logical choice then. But I would not
> > > limit its scope to alias handling. How about making it a bus-private
> > > pointer where IOMMU drivers store bus-specific information. That way we
> > > make sure that there is one struct per bus-type for this pointer, and
> > > not one structure per IOMMU driver.
> > 
> > I thought of another approach that may actually be more 3.6 worthy.
> > What if we just make the iommu driver handle it?  For instance,
> > amd_iommu can walk the alias table looking for entries that use the same
> > alias and get the device via pci_get_bus_and_slot.  If it finds a device
> > with an iommu group, it attaches the new device to the same group,
> > hiding anything about aliases from the group layer.  It just groups all
> > devices within the range.  I think the only complication is making sure
> > we're safe around device hotplug while we're doing this.  Thanks,
> 
> I think this could work.  Instead of searching for other devices, check
> for or allocate an iommu group on the alias dev_data, any "virtual"
> aliases use that iommu group.  Florian, could you test this as well?

Here's a lockdep clean version of it:

amd_iommu: Handle aliases not backed by devices

Aliases sometimes don't have a struct pci_dev backing them.  This breaks
our attempt to figure out the topology and device quirks that may affect
IOMMU grouping.  When this happens, allocate an IOMMU group on the
dev_data for the alias and make use of it for all devices referencing
this alias.

Signed-off-by: Alex Williamson 
---

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b64502d..4eacb17 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -71,6 +71,7 @@ static DEFINE_SPINLOCK(iommu_pd_list_lock);
 /* List of all available dev_data structures */
 static LIST_HEAD(dev_data_list);
 static DEFINE_SPINLOCK(dev_data_list_lock);
+static DEFINE_MUTEX(dev_data_iommu_group_lock);
 
 /*
  * Domain for untranslated devices - only allocated
@@ -128,6 +129,9 @@ static void free_dev_data(struct iommu_dev_data *dev_data)
 	list_del(&dev_data->dev_data_list);
 	spin_unlock_irqrestore(&dev_data_list_lock, flags);
 
+	if (dev_data->group)
+		iommu_group_put(dev_data->group);
+
 	kfree(dev_data);
 }
 
@@ -256,6 +260,34 @@ static bool check_device(struct device *dev)
 	return true;
 }
 
+/*
+ * Sometimes there's no actual device for an alias.  When that happens
+ * we allocate an iommu group on the dev_data and use it for anything
+ * aliasing back to this device.  This makes sure that multiple devices
+ * aliased to a non-existent device id all get grouped together.  Hold
+ * on to the reference for the group, it can be static rather than get
+ * automatically reclaimed if this device later gets removed.
+ */
+static int dev_data_add_iommu_group(struct iommu_dev_data *dev_data,
+				    struct device *dev)
+{
+	mutex_lock(&dev_data_iommu_group_lock);
+
+	if (!dev_data->group) {
+		struct iommu_group *group = iommu_group_alloc();
+		if (IS_ERR(group)) {
+			mutex_unlock(&dev_data_iommu_group_lock);
+			return PTR_ERR(group);
+		}
+
+		dev_data->group = group;
+	}
+
+	mutex_unlock(&dev_data_iommu_group_lock);
+
+	return iommu_group_add_device(dev_data->group, dev);
+}
+
 static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 {
 	pci_dev_put(*from);
@@ -264,38 +296,17 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 
 #define REQ_ACS_FLAGS  (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)
 
-static int iommu_init_device(struct device *dev)
+/*
+ * Given a pci device, look at device quirks and topology between it
+ * and the IOMMU to 

Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 10:21 -0600, Alex Williamson wrote:
> On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > > Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> > > alias interface to iommu groups.  Seems like this could just be an extra
> > > parameter to iommu_group_get and iommu_group_add_device (empty in the
> > > typical case).  Then we have the problem of what's the type for an
> > > alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> > > that.  Maybe iommu groups should just treat it as a void* so iommus can
> > > use a pointer to some structure or a fixed value like a u16 bus:slot.
> > > Thoughts?
> > 
> > Good question. The iommu-groups are part of the IOMMU-API, with an
> > interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> > alias handling itself should be a function of the interface to the IOMMU
> > driver. In general the interface should not be bus specific.
> > 
> > So a void pointer seems the only logical choice then. But I would not
> > limit its scope to alias handling. How about making it a bus-private
> > pointer where IOMMU drivers store bus-specific information. That way we
> > make sure that there is one struct per bus-type for this pointer, and
> > not one structure per IOMMU driver.
> 
> I thought of another approach that may actually be more 3.6 worthy.
> What if we just make the iommu driver handle it?  For instance,
> amd_iommu can walk the alias table looking for entries that use the same
> alias and get the device via pci_get_bus_and_slot.  If it finds a device
> with an iommu group, it attaches the new device to the same group,
> hiding anything about aliases from the group layer.  It just groups all
> devices within the range.  I think the only complication is making sure
> we're safe around device hotplug while we're doing this.  Thanks,

I think this could work.  Instead of searching for other devices, check
for or allocate an iommu group on the alias dev_data, any "virtual"
aliases use that iommu group.  Florian, could you test this as well?
Thanks,

Alex
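
[Editorial sketch] The idea described above can be modelled in plain userspace C. This is not the kernel patch: `struct group`, `struct dev_data`, `group_alloc` and `dev_data_add_group` are hypothetical stand-ins for `struct iommu_group`, `struct iommu_dev_data`, `iommu_group_alloc` and the patch's helper. The point it illustrates is only that the group hangs off the alias's dev_data and is allocated on first use, so every device that resolves to the same alias ends up in the same group even when no pci_dev exists for the alias.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct iommu_group: an id and a member count. */
struct group {
	int id;
	int ndevs;
};

/* Hypothetical stand-in for struct iommu_dev_data, keyed by a 16-bit device id. */
struct dev_data {
	unsigned short devid;
	struct group *group;	/* lazily allocated, shared by all aliasing devices */
};

static int next_group_id;

static struct group *group_alloc(void)
{
	struct group *g = calloc(1, sizeof(*g));

	if (g)
		g->id = next_group_id++;
	return g;
}

/*
 * Attach one device to the group stored on the alias's dev_data,
 * allocating that group the first time it is needed.
 * Returns the group id, or -1 if allocation failed.
 */
static int dev_data_add_group(struct dev_data *alias_data)
{
	assert(alias_data != NULL);

	if (!alias_data->group) {
		alias_data->group = group_alloc();
		if (!alias_data->group)
			return -1;
	}

	alias_data->group->ndevs++;
	return alias_data->group->id;
}
```

Calling `dev_data_add_group()` twice on the same alias dev_data returns the same group id both times, which is the property the patch relies on. Locking and reference counting are deliberately left out of this sketch.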

Signed-off-by: Alex Williamson 
---

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b64502d..22879ed 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -126,6 +126,8 @@ static void free_dev_data(struct iommu_dev_data *dev_data)
 
 	spin_lock_irqsave(&dev_data_list_lock, flags);
 	list_del(&dev_data->dev_data_list);
+	if (dev_data->group)
+		iommu_group_put(dev_data->group);
 	spin_unlock_irqrestore(&dev_data_list_lock, flags);
 
 	kfree(dev_data);
@@ -256,6 +258,37 @@ static bool check_device(struct device *dev)
 	return true;
 }
 
+/*
+ * Sometimes there's no actual device for an alias.  When that happens
+ * we allocate an iommu group on the iommu_dev_data so that it gets used
+ * by anything with the same alias.  We keep the reference from
+ * iommu_group_alloc so the group persists with the iommu_dev_data.
+ */
+static int dev_data_add_iommu_group(struct iommu_dev_data *dev_data,
+				    struct device *dev)
+{
+	unsigned long flags;
+	struct iommu_group *group;
+	int ret = 0;
+
+	spin_lock_irqsave(&dev_data_list_lock, flags);
+	if (!dev_data->group) {
+		group = iommu_group_alloc();
+		if (IS_ERR(group)) {
+			ret = PTR_ERR(group);
+			goto unlock;
+		}
+
+		dev_data->group = group;
+	} else
+		group = dev_data->group;
+
+	ret = iommu_group_add_device(group, dev);
+unlock:
+	spin_unlock_irqrestore(&dev_data_list_lock, flags);
+	return ret;
+}
+
 static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 {
 	pci_dev_put(*from);
@@ -264,38 +297,12 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 
 #define REQ_ACS_FLAGS  (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)
 
-static int iommu_init_device(struct device *dev)
+static int pdev_add_iommu_group(struct pci_dev *pdev, struct device *dev)
 {
-	struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev);
-	struct iommu_dev_data *dev_data;
+	struct pci_dev *dma_pdev = pdev;
 	struct iommu_group *group;
-	u16 alias;
 	int ret;
 
-	if (dev->archdata.iommu)
-		return 0;
-
-	dev_data = find_dev_data(get_device_id(dev));
-	if (!dev_data)
-		return -ENOMEM;
-
-	alias = amd_iommu_alias_table[dev_data->devid];
-	if (alias != dev_data->devid) {
-		struct iommu_dev_data *alias_data;
-
-		alias_data = find_dev_data(alias);
-		if (alias_data == NULL) {
-			pr_err("AMD-Vi: Warning: Unhandled device %s\n",
-				dev_name(dev));
-   

Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Florian Dazinger
On Wed, 26 Sep 2012 17:04:07 +0200, "Roedel, Joerg" wrote:

> On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> > Assuming this works, it may be ok as a 3.7 fix, but if there was
> > actually more than one device behind the alias we'd expose them as
> > separate iommu groups.  I don't think that's what we want.  Maybe it
> > should at least get a pr_warn.  Thanks,
> 
> True, we need something more generic as the real fix. When Florian
> reports success I'll try to get this still into 3.6, otherwise to
> -stable.
> 
> 
>   Joerg
> 

Updating to the newest BIOS revision does not make any difference: rc7
crashes, rc7 plus the patch does not, so I definitely need this patch.
If there is more you want me to test, please tell me.

Florian


Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Florian Dazinger
On Wed, 26 Sep 2012 17:04:07 +0200, "Roedel, Joerg" wrote:

> On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> > Assuming this works, it may be ok as a 3.7 fix, but if there was
> > actually more than one device behind the alias we'd expose them as
> > separate iommu groups.  I don't think that's what we want.  Maybe it
> > should at least get a pr_warn.  Thanks,
> 
> True, we need something more generic as the real fix. When Florian
> reports success I'll try to get this still into 3.6, otherwise to
> -stable.
> 
> 
>   Joerg
> 

It still fails with the card in a *different* slot.
But with the patch applied, everything works, so this fixes it for me! Thanks a
lot. The relevant parts of the dmesg and lspci output are below. I'll still try
the newest BIOS revision and report back.
Thanks, Florian


dmesg (kernel-3.5.4):

[0.448252] RPC: Registered tcp NFSv4.1 backchannel transport module.
[1.471021] pci :01:00.0: Boot video device
[1.471118] PCI: CLS 64 bytes, default 64
[1.473864] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
[1.473902] AMD-Vi:mmio-addr: feb2
[1.474119] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
[1.474153] AMD-Vi:   DEV_RANGE_END           devid: 00:00.2
[1.474187] AMD-Vi:   DEV_SELECT              devid: 00:02.0 flags: 00
[1.474220] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
[1.474254] AMD-Vi:   DEV_RANGE_END           devid: 01:00.1
[1.474287] AMD-Vi:   DEV_SELECT              devid: 00:04.0 flags: 00
[1.474321] AMD-Vi:   DEV_SELECT              devid: 02:00.0 flags: 00
[1.474354] AMD-Vi:   DEV_SELECT              devid: 00:05.0 flags: 00
[1.474388] AMD-Vi:   DEV_SELECT              devid: 03:00.0 flags: 00
[1.474421] AMD-Vi:   DEV_SELECT              devid: 00:06.0 flags: 00
[1.474455] AMD-Vi:   DEV_SELECT              devid: 04:00.0 flags: 00
[1.474488] AMD-Vi:   DEV_SELECT              devid: 00:07.0 flags: 00
[1.474522] AMD-Vi:   DEV_SELECT              devid: 05:00.0 flags: 00
[1.474555] AMD-Vi:   DEV_SELECT              devid: 00:09.0 flags: 00
[1.474589] AMD-Vi:   DEV_SELECT              devid: 06:00.0 flags: 00
[1.474622] AMD-Vi:   DEV_SELECT              devid: 00:0d.0 flags: 00
[1.474656] AMD-Vi:   DEV_SELECT              devid: 07:00.0 flags: 00
[1.474689] AMD-Vi:   DEV_ALIAS_RANGE         devid: 08:01.0 flags: 00 devid_to: 08:00.0
[1.474726] AMD-Vi:   DEV_RANGE_END           devid: 08:1f.7
[1.474764] AMD-Vi:   DEV_SELECT              devid: 00:11.0 flags: 00
[1.474798] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
[1.474836] AMD-Vi:   DEV_RANGE_END           devid: 00:12.2
[1.474870] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
[1.474903] AMD-Vi:   DEV_RANGE_END           devid: 00:13.2
[1.474937] AMD-Vi:   DEV_SELECT              devid: 00:14.0 flags: d7
[1.474970] AMD-Vi:   DEV_SELECT              devid: 00:14.3 flags: 00
[1.475004] AMD-Vi:   DEV_SELECT              devid: 00:14.4 flags: 00
[1.475038] AMD-Vi:   DEV_ALIAS_RANGE         devid: 09:00.0 flags: 00 devid_to: 00:14.4
[1.475074] AMD-Vi:   DEV_RANGE_END           devid: 09:1f.7
[1.475112] AMD-Vi:   DEV_SELECT              devid: 00:14.5 flags: 00
[1.475146] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
[1.475180] AMD-Vi:   DEV_RANGE_END           devid: 00:16.2
[1.475271] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
[1.529007] pci :00:00.2: irq 72 for MSI/MSI-X
[1.539126] AMD-Vi: Lazy IO/TLB flushing enabled
[1.539750] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[1.539787] software IO TLB [mem 0xc9728000-0xcd727fff] (64MB) mapped at [8800c9728000-8800cd727fff]
[1.539957] kvm: Nested Virtualization enabled
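
[Editorial sketch] The two DEV_ALIAS_RANGE entries in the log above are what matter for this thread: every device id from 08:01.0 to 08:1f.7 is reported as aliasing to 08:00.0, and every id from 09:00.0 to 09:1f.7 as aliasing to 00:14.4. The userspace C below is a rough model of how such range entries could populate a table indexed by 16-bit device id, assuming the usual packing of bus in the high byte and slot/function in the low byte. The names `alias_table`, `make_devid`, `alias_table_init` and `alias_range` are hypothetical and are not the driver's own.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DEVID 0xffff

/* One alias entry per possible 16-bit device id (bus:slot.func). */
static uint16_t alias_table[MAX_DEVID + 1];

/* Pack bus:slot.func into the 16-bit id format printed in the log. */
static uint16_t make_devid(unsigned bus, unsigned slot, unsigned func)
{
	assert(bus <= 0xff && slot <= 0x1f && func <= 0x7);
	return (uint16_t)((bus << 8) | (slot << 3) | func);
}

/* By default every device is its own alias. */
static void alias_table_init(void)
{
	for (uint32_t i = 0; i <= MAX_DEVID; i++)
		alias_table[i] = (uint16_t)i;
}

/* Apply one DEV_ALIAS_RANGE ... DEV_RANGE_END pair (bounds inclusive). */
static void alias_range(uint16_t first, uint16_t last, uint16_t devid_to)
{
	for (uint32_t i = first; i <= last; i++)
		alias_table[i] = devid_to;
}
```

With the first range applied, looking up 08:04.0 yields 08:00.0, an address at which no PCI device needs to exist. That is the situation the patches in this thread have to cope with.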

lspci (kernel-3.5.4):
00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
	Subsystem: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B)
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- 
	Capabilities: [54] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: fee0f00c  Data: 41a9
	Capabilities: [64] HyperTransport: MSI Mapping Enable+ Fixed+

00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port B) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- 

Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> > alias interface to iommu groups.  Seems like this could just be an extra
> > parameter to iommu_group_get and iommu_group_add_device (empty in the
> > typical case).  Then we have the problem of what's the type for an
> > alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> > that.  Maybe iommu groups should just treat it as a void* so iommus can
> > use a pointer to some structure or a fixed value like a u16 bus:slot.
> > Thoughts?
> 
> Good question. The iommu-groups are part of the IOMMU-API, with an
> interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> alias handling itself should be a function of the interface to the IOMMU
> driver. In general the interface should not be bus specific.
> 
> So a void pointer seems the only logical choice then. But I would not
> limit its scope to alias handling. How about making it a bus-private
> pointer where IOMMU drivers store bus-specific information. That way we
> make sure that there is one struct per bus-type for this pointer, and
> not one structure per IOMMU driver.

I thought of another approach that may actually be more 3.6 worthy.
What if we just make the iommu driver handle it?  For instance,
amd_iommu can walk the alias table looking for entries that use the same
alias and get the device via pci_get_bus_and_slot.  If it finds a device
with an iommu group, it attaches the new device to the same group,
hiding anything about aliases from the group layer.  It just groups all
devices within the range.  I think the only complication is making sure
we're safe around device hotplug while we're doing this.  Thanks,

Alex
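
[Editorial sketch] The driver-side approach proposed above, scanning the alias table for other entries that share the new device's alias and reusing the group of the first one that already has a group, might look roughly like this in userspace C. `struct dev_state`, `NDEVS` and `find_group_by_alias` are hypothetical names, and the `present` flag stands in for a successful `pci_get_bus_and_slot` lookup. Hotplug safety, which the message flags as the open problem, is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NDEVS 0x10000

/* Hypothetical per-device state: is the device present, and its group (-1 = none). */
struct dev_state {
	int present;
	int group;
};

/*
 * Scan the alias table for another present device that shares devid's
 * alias and already has a group.  Returns that group id, or -1 if no
 * such device exists and a new group would have to be allocated.
 */
static int find_group_by_alias(const uint16_t *alias_table,
			       const struct dev_state *devs, uint16_t devid)
{
	assert(alias_table != NULL && devs != NULL);

	uint16_t alias = alias_table[devid];

	for (uint32_t i = 0; i < NDEVS; i++) {
		if (i == devid || alias_table[i] != alias)
			continue;
		if (devs[i].present && devs[i].group >= 0)
			return devs[i].group;
	}
	return -1;
}
```

The group layer never sees the alias in this scheme; it only sees devices being added to an existing group.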




Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 17:04 +0200, Roedel, Joerg wrote:
> On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> > Assuming this works, it may be ok as a 3.7 fix, but if there was
> > actually more than one device behind the alias we'd expose them as
> > separate iommu groups.  I don't think that's what we want.  Maybe it
> > should at least get a pr_warn.  Thanks,
> 
> True, we need something more generic as the real fix. When Florian
> reports success I'll try to get this still into 3.6, otherwise to
> -stable.

Yes, 3.6 is what I meant to type.  Thanks,

Alex




Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Roedel, Joerg
On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> alias interface to iommu groups.  Seems like this could just be an extra
> parameter to iommu_group_get and iommu_group_add_device (empty in the
> typical case).  Then we have the problem of what's the type for an
> alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> that.  Maybe iommu groups should just treat it as a void* so iommus can
> use a pointer to some structure or a fixed value like a u16 bus:slot.
> Thoughts?

Good question. The iommu-groups are part of the IOMMU-API, with an
interface to the IOMMU drivers and one to the users of IOMMU-API. So the
alias handling itself should be a function of the interface to the IOMMU
driver. In general the interface should not be bus specific.

So a void pointer seems the only logical choice then. But I would not
limit its scope to alias handling. How about making it a bus-private
pointer where IOMMU drivers store bus-specific information. That way we
make sure that there is one struct per bus-type for this pointer, and
not one structure per IOMMU driver.


Joerg
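
[Editorial sketch] One way to picture the bus-private pointer suggested above is a group that carries an opaque `void *` plus a tag saying which bus type owns it, with one data struct per bus type rather than one per IOMMU driver. Everything here (`enum bus_type`, `struct pci_group_data`, `struct group`, `group_set_bus_priv`, `group_pci_data`) is hypothetical and only illustrates the shape of the idea; no such interface is defined in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bus tags; the group layer never looks inside bus_priv. */
enum bus_type { BUS_PCI, BUS_PLATFORM };

/* Hypothetical PCI data: one struct per bus type, shared by all IOMMU drivers. */
struct pci_group_data {
	uint16_t alias;		/* request id the devices in the group alias to */
};

/* Hypothetical stand-in for an iommu group with an opaque bus-private pointer. */
struct group {
	enum bus_type bus;
	void *bus_priv;
};

static void group_set_bus_priv(struct group *g, enum bus_type bus, void *priv)
{
	assert(g != NULL);
	g->bus = bus;
	g->bus_priv = priv;
}

/* Only code that knows the bus type interprets the pointer. */
static const struct pci_group_data *group_pci_data(const struct group *g)
{
	assert(g != NULL);
	return g->bus == BUS_PCI ? g->bus_priv : NULL;
}
```

An alias would then be just one field of the PCI-specific struct, and other bus types could attach their own data without changing the group interface.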



Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Roedel, Joerg
On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> Assuming this works, it may be ok as a 3.7 fix, but if there was
> actually more than one device behind the alias we'd expose them as
> separate iommu groups.  I don't think that's what we want.  Maybe it
> should at least get a pr_warn.  Thanks,

True, we need something more generic as the real fix. When Florian
reports success I'll try to get this still into 3.6, otherwise to
-stable.


Joerg



Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 16:43 +0200, Roedel, Joerg wrote:
> Florian,
> 
> On Wed, Sep 26, 2012 at 01:01:54AM +0200, Florian Dazinger wrote:
> > you're right, either "amd_iommu=off" or removing the audio card makes
> > the failure disappear. I will test the new BIOS rev. tomorrow.
> 
> Can you please test this diff and report if it fixes the problem for
> you?
> 
> Thanks.
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index b64502d..e89daf1 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -266,7 +266,7 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
>  
>  static int iommu_init_device(struct device *dev)
>  {
> -	struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev);
> +	struct pci_dev *dma_pdev = NULL, *pdev = to_pci_dev(dev);
>  	struct iommu_dev_data *dev_data;
>  	struct iommu_group *group;
>  	u16 alias;
> @@ -293,7 +293,9 @@ static int iommu_init_device(struct device *dev)
>  		dev_data->alias_data = alias_data;
>  
>  		dma_pdev = pci_get_bus_and_slot(alias >> 8, alias & 0xff);
> -	} else
> +	}
> +
> +	if (dma_pdev == NULL)
>  		dma_pdev = pci_dev_get(pdev);
>  
>  	/* Account for quirked devices */
> 

Assuming this works, it may be ok as a 3.7 fix, but if there was
actually more than one device behind the alias we'd expose them as
separate iommu groups.  I don't think that's what we want.  Maybe it
should at least get a pr_warn.  Thanks,

Alex
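
[Editorial sketch] The quoted diff splits the 16-bit alias into a bus number (`alias >> 8`) and a devfn (`alias & 0xff`), looks up a device at that address, and falls back to the device itself when the lookup returns NULL. A small userspace model of that decision is shown below; `struct fake_pci_dev`, `lookup` and `dma_dev_for` are hypothetical stand-ins for `struct pci_dev`, `pci_get_bus_and_slot` and the logic inside `iommu_init_device`. It also makes the concern above visible: two devices whose shared alias has no backing device each fall back to themselves, so they would end up in separate groups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev. */
struct fake_pci_dev {
	uint8_t bus;
	uint8_t devfn;	/* slot in bits 7:3, function in bits 2:0 */
};

/* Look up a device by bus/devfn in a small array; NULL if not present. */
static const struct fake_pci_dev *lookup(const struct fake_pci_dev *devs,
					 size_t n, uint8_t bus, uint8_t devfn)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].bus == bus && devs[i].devfn == devfn)
			return &devs[i];
	return NULL;
}

/*
 * Pick the device used for grouping: the alias if a device exists at
 * that address, otherwise the device itself (the fallback in the diff).
 */
static const struct fake_pci_dev *dma_dev_for(const struct fake_pci_dev *devs,
					      size_t n,
					      const struct fake_pci_dev *pdev,
					      uint16_t alias)
{
	assert(pdev != NULL);

	const struct fake_pci_dev *dma =
		lookup(devs, n, (uint8_t)(alias >> 8), (uint8_t)(alias & 0xff));

	return dma ? dma : pdev;
}
```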



Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Roedel, Joerg
Florian,

On Wed, Sep 26, 2012 at 01:01:54AM +0200, Florian Dazinger wrote:
> you're right, either "amd_iommu=off" or removing the audio card makes
> the failure disappear. I will test the new BIOS rev. tomorrow.

Can you please test this diff and report if it fixes the problem for
you?

Thanks.

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b64502d..e89daf1 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -266,7 +266,7 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 
 static int iommu_init_device(struct device *dev)
 {
-	struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev);
+	struct pci_dev *dma_pdev = NULL, *pdev = to_pci_dev(dev);
 	struct iommu_dev_data *dev_data;
 	struct iommu_group *group;
 	u16 alias;
@@ -293,7 +293,9 @@ static int iommu_init_device(struct device *dev)
 		dev_data->alias_data = alias_data;
 
 		dma_pdev = pci_get_bus_and_slot(alias >> 8, alias & 0xff);
-	} else
+	}
+
+	if (dma_pdev == NULL)
 		dma_pdev = pci_dev_get(pdev);
 
 	/* Account for quirked devices */

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632



Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 15:20 +0200, Roedel, Joerg wrote:
> On Tue, Sep 25, 2012 at 01:43:46PM -0600, Alex Williamson wrote:
> > Joerg, any thoughts on a quirk for this?  Unfortunately we can't just
> > skip IOMMU groups when an alias is broken because it puts the other
> > IOMMU groups at risk that might not actually be isolated from this
> > device.  It looks like we parse the alias info before PCI is probed, so
> > maybe we'd need to call the quirk from iommu_init_device itself.
> 
> I fear that the BIOS does everything right and device 08:04.0 is indeed
> using 08:00.0 as request-id. There are a couple of devices where this
> happens, usually when the vendor just took the old 32bit PCI chip, added
> a transparent PCIe-to-PCI bridge to the device and sold it as PCIe.
> 
> So the assumption that every request-id has a corresponding pci_dev
> structure does not hold. I also had made that assumption in the
> AMD IOMMU driver but had to add code which removes that assumption. We
> should look for a way to remove that assumption from the group-code too.

Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
alias interface to iommu groups.  Seems like this could just be an extra
parameter to iommu_group_get and iommu_group_add_device (empty in the
typical case).  Then we have the problem of what's the type for an
alias?  For AMD-Vi, it's a u16, but we need to be more generic than
that.  Maybe iommu groups should just treat it as a void* so iommus can
use a pointer to some structure or a fixed value like a u16 bus:slot.
Thoughts?  Thanks,

Alex
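
A rough userspace sketch of the opaque-alias idea above, not the IOMMU API. All names here (group_table, alias_key_from_devid, group_for_alias) are hypothetical. It assumes the group layer only compares alias keys for equality and that NULL means "no alias", the typical case.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ALIAS_GROUPS 16

struct alias_group {
	const void *alias_key;	/* opaque to the group layer */
	int group_id;
};

struct group_table {
	struct alias_group entries[MAX_ALIAS_GROUPS];
	size_t count;
	int next_id;
};

/*
 * Pack a u16 bus:devfn into the opaque key.  Bit 16 is set so that
 * device id 00:00.0 does not collide with NULL ("no alias").
 */
static const void *alias_key_from_devid(uint16_t devid)
{
	return (const void *)((uintptr_t)devid | 0x10000u);
}

/* Return the group for an alias key, creating one on first use. */
static int group_for_alias(struct group_table *t, const void *alias_key)
{
	size_t i;

	if (alias_key == NULL)		/* typical case: private group */
		return t->next_id++;

	for (i = 0; i < t->count; i++)
		if (t->entries[i].alias_key == alias_key)
			return t->entries[i].group_id;

	if (t->count == MAX_ALIAS_GROUPS)
		return -1;		/* table full */

	t->entries[t->count].alias_key = alias_key;
	t->entries[t->count].group_id = t->next_id;
	t->count++;
	return t->next_id++;
}
```

An IOMMU driver that prefers a structure over a packed value could pass the address of that structure as the key instead; the lookup does not change.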



Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Roedel, Joerg
On Tue, Sep 25, 2012 at 01:43:46PM -0600, Alex Williamson wrote:
> Joerg, any thoughts on a quirk for this?  Unfortunately we can't just
> skip IOMMU groups when an alias is broken because it puts the other
> IOMMU groups at risk that might not actually be isolated from this
> device.  It looks like we parse the alias info before PCI is probed, so
> maybe we'd need to call the quirk from iommu_init_device itself.

I fear that the BIOS does everything right and device 08:04.0 is indeed
using 08:00.0 as request-id. There are a couple of devices where this
happens, usually when the vendor just took the old 32bit PCI chip, added
a transparent PCIe-to-PCI bridge to the device and sold it as PCIe.

So the assumption that every request-id has a corresponding pci_dev
structure does not hold. I also had made that assumption in the
AMD IOMMU driver but had to add code which removes that assumption. We
should look for a way to remove that assumption from the group-code too.


Joerg
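
For reference, a small sketch of how a 16-bit request-id such as 08:04.0 maps to bus, slot and function. This is the same split the driver performs with "alias >> 8" and "alias & 0xff" when it calls pci_get_bus_and_slot(). The helper names are made up for illustration.

```c
#include <stdint.h>

/* bus in bits 15:8, slot in bits 7:3, function in bits 2:0 */
static inline uint16_t make_devid(uint8_t bus, uint8_t slot, uint8_t func)
{
	return (uint16_t)((bus << 8) | ((slot & 0x1f) << 3) | (func & 0x07));
}

static inline uint8_t devid_bus(uint16_t devid)
{
	return (uint8_t)(devid >> 8);
}

/* devfn is slot and function packed together, as the PCI core expects */
static inline uint8_t devid_devfn(uint16_t devid)
{
	return (uint8_t)(devid & 0xff);
}

static inline uint8_t devid_slot(uint16_t devid)
{
	return (uint8_t)((devid >> 3) & 0x1f);
}

static inline uint8_t devid_func(uint16_t devid)
{
	return (uint8_t)(devid & 0x07);
}
```

With this layout, device 08:04.0 is 0x0820 and its alias 08:00.0 is 0x0800.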



Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Roedel, Joerg
On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> Assuming this works, it may be ok as a 3.7 fix, but if there was
> actually more than one device behind the alias we'd expose them as
> separate iommu groups.  I don't think that's what we want.  Maybe it
> should at least get a pr_warn.  Thanks,

True, we need something more generic as the real fix. When Florian
reports success I'll try to get this still into 3.6, otherwise to
-stable.


Joerg



Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Roedel, Joerg
On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> alias interface to iommu groups.  Seems like this could just be an extra
> parameter to iommu_group_get and iommu_group_add_device (empty in the
> typical case).  Then we have the problem of what's the type for an
> alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> that.  Maybe iommu groups should just treat it as a void* so iommus can
> use a pointer to some structure or a fixed value like a u16 bus:slot.
> Thoughts?

Good question. The iommu-groups are part of the IOMMU-API, with an
interface to the IOMMU drivers and one to the users of IOMMU-API. So the
alias handling itself should be a function of the interface to the IOMMU
driver. In general the interface should not be bus specific.

So a void pointer seems the only logical choice then. But I would not
limit its scope to alias handling. How about making it a bus-private
pointer where IOMMU drivers store bus-specific information. That way we
make sure that there is one struct per bus-type for this pointer, and
not one structure per IOMMU driver.


Joerg



Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 17:04 +0200, Roedel, Joerg wrote:
> On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> > Assuming this works, it may be ok as a 3.7 fix, but if there was
> > actually more than one device behind the alias we'd expose them as
> > separate iommu groups.  I don't think that's what we want.  Maybe it
> > should at least get a pr_warn.  Thanks,
> 
> True, we need something more generic as the real fix. When Florian
> reports success I'll try to get this still into 3.6, otherwise to
> -stable.

Yes, 3.6 is what I meant to type.  Thanks,

Alex




Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> > alias interface to iommu groups.  Seems like this could just be an extra
> > parameter to iommu_group_get and iommu_group_add_device (empty in the
> > typical case).  Then we have the problem of what's the type for an
> > alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> > that.  Maybe iommu groups should just treat it as a void* so iommus can
> > use a pointer to some structure or a fixed value like a u16 bus:slot.
> > Thoughts?
> 
> Good question. The iommu-groups are part of the IOMMU-API, with an
> interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> alias handling itself should be a function of the interface to the IOMMU
> driver. In general the interface should not be bus specific.
> 
> So a void pointer seems the only logical choice then. But I would not
> limit its scope to alias handling. How about making it a bus-private
> pointer where IOMMU drivers store bus-specific information. That way we
> make sure that there is one struct per bus-type for this pointer, and
> not one structure per IOMMU driver.

I thought of another approach that may actually be more 3.6 worthy.
What if we just make the iommu driver handle it?  For instance,
amd_iommu can walk the alias table looking for entries that use the same
alias and get the device via pci_get_bus_and_slot.  If it finds a device
with an iommu group, it attaches the new device to the same group,
hiding anything about aliases from the group layer.  It just groups all
devices within the range.  I think the only complication is making sure
we're safe around device hotplug while we're doing this.  Thanks,

Alex
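
A simplified userspace sketch of the table walk described above. It assumes two parallel arrays indexed by device id: the alias table, and a hypothetical group_of array holding each device's group (or NO_GROUP). Neither the function name nor group_of exists in the driver, and hotplug safety is not addressed here.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_GROUP (-1)

/*
 * Look for another device id that resolves to the same alias as devid
 * and already has a group.  Returns that group, or NO_GROUP if the
 * caller has to allocate a fresh one.  n is the number of table entries
 * and must be greater than devid.
 */
static int find_group_for_alias(const uint16_t *alias_table,
				const int *group_of, size_t n,
				uint16_t devid)
{
	uint16_t alias = alias_table[devid];
	size_t i;

	for (i = 0; i < n; i++) {
		if (i == devid)
			continue;
		if (alias_table[i] == alias && group_of[i] != NO_GROUP)
			return group_of[i];
	}
	return NO_GROUP;
}
```

The walk is linear in the table size for every device added, which is one reason caching the group on the alias entry itself may be preferable.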




Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Florian Dazinger
On Wed, 26 Sep 2012 17:04:07 +0200,
Roedel, Joerg <joerg.roe...@amd.com> wrote:

> On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> > Assuming this works, it may be ok as a 3.7 fix, but if there was
> > actually more than one device behind the alias we'd expose them as
> > separate iommu groups.  I don't think that's what we want.  Maybe it
> > should at least get a pr_warn.  Thanks,
> 
> True, we need something more generic as the real fix. When Florian
> reports success I'll try to get this still into 3.6, otherwise to
> -stable.
> 
> 
>   Joerg
> 

It still fails with the card in a *different* slot.
But with the patch applied, everything works, so this fixes it for me! Thanks a
lot. The relevant parts of the dmesg and lspci output are below. I'll still try
the newest BIOS rev. and report back.
Thanks, Florian


dmesg (kernel-3.5.4):

[0.448252] RPC: Registered tcp NFSv4.1 backchannel transport module.
[1.471021] pci :01:00.0: Boot video device
[1.471118] PCI: CLS 64 bytes, default 64
[1.473864] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
[1.473902] AMD-Vi:mmio-addr: feb2
[1.474119] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
[1.474153] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
[1.474187] AMD-Vi:   DEV_SELECT  devid: 00:02.0 flags: 00
[1.474220] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
[1.474254] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
[1.474287] AMD-Vi:   DEV_SELECT  devid: 00:04.0 flags: 00
[1.474321] AMD-Vi:   DEV_SELECT  devid: 02:00.0 flags: 00
[1.474354] AMD-Vi:   DEV_SELECT  devid: 00:05.0 flags: 00
[1.474388] AMD-Vi:   DEV_SELECT  devid: 03:00.0 flags: 00
[1.474421] AMD-Vi:   DEV_SELECT  devid: 00:06.0 flags: 00
[1.474455] AMD-Vi:   DEV_SELECT  devid: 04:00.0 flags: 00
[1.474488] AMD-Vi:   DEV_SELECT  devid: 00:07.0 flags: 00
[1.474522] AMD-Vi:   DEV_SELECT  devid: 05:00.0 flags: 00
[1.474555] AMD-Vi:   DEV_SELECT  devid: 00:09.0 flags: 00
[1.474589] AMD-Vi:   DEV_SELECT  devid: 06:00.0 flags: 00
[1.474622] AMD-Vi:   DEV_SELECT  devid: 00:0d.0 flags: 00
[1.474656] AMD-Vi:   DEV_SELECT  devid: 07:00.0 flags: 00
[1.474689] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 flags: 00 devid_to: 08:00.0
[1.474726] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
[1.474764] AMD-Vi:   DEV_SELECT  devid: 00:11.0 flags: 00
[1.474798] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
[1.474836] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
[1.474870] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
[1.474903] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
[1.474937] AMD-Vi:   DEV_SELECT  devid: 00:14.0 flags: d7
[1.474970] AMD-Vi:   DEV_SELECT  devid: 00:14.3 flags: 00
[1.475004] AMD-Vi:   DEV_SELECT  devid: 00:14.4 flags: 00
[1.475038] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 flags: 00 devid_to: 00:14.4
[1.475074] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
[1.475112] AMD-Vi:   DEV_SELECT  devid: 00:14.5 flags: 00
[1.475146] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
[1.475180] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
[1.475271] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
[1.529007] pci :00:00.2: irq 72 for MSI/MSI-X
[1.539126] AMD-Vi: Lazy IO/TLB flushing enabled
[1.539750] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[1.539787] software IO TLB [mem 0xc9728000-0xcd727fff] (64MB) mapped at [8800c9728000-8800cd727fff]
[1.539957] kvm: Nested Virtualization enabled

lspci (kernel-3.5.4):
00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
	Subsystem: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B)
	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Capabilities: [f0] HyperTransport: MSI Mapping Enable+ Fixed+
	Capabilities: [c4] HyperTransport: Slave or Primary Interface
		Command: BaseUnitID=0 UnitCnt=20 MastHost- DefDir- DUL-
		Link Control 0: CFlE- CST- CFE- <LkFail- Init+ EOC- TXO- <CRCErr=0 IsocEn- LSEn- ExtCTL- 64b-
		Link Config 0: MLWI=16bit DwFcIn- MLWO=16bit DwFcOut- LWI=16bit DwFcInEn- LWO=16bit DwFcOutEn-
		Link Control 1: 

Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Florian Dazinger
On Wed, 26 Sep 2012 17:04:07 +0200,
Roedel, Joerg <joerg.roe...@amd.com> wrote:

> On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> > Assuming this works, it may be ok as a 3.7 fix, but if there was
> > actually more than one device behind the alias we'd expose them as
> > separate iommu groups.  I don't think that's what we want.  Maybe it
> > should at least get a pr_warn.  Thanks,
> 
> True, we need something more generic as the real fix. When Florian
> reports success I'll try to get this still into 3.6, otherwise to
> -stable.
> 
> 
>   Joerg
> 

Updating to the newest BIOS revision does not make any difference: rc7 is
crashing, rc7+patch is not. I definitely need this patch.
If there is more you want me to test, please tell me.

Florian


Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 10:21 -0600, Alex Williamson wrote:
> On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > > Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> > > alias interface to iommu groups.  Seems like this could just be an extra
> > > parameter to iommu_group_get and iommu_group_add_device (empty in the
> > > typical case).  Then we have the problem of what's the type for an
> > > alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> > > that.  Maybe iommu groups should just treat it as a void* so iommus can
> > > use a pointer to some structure or a fixed value like a u16 bus:slot.
> > > Thoughts?
> > 
> > Good question. The iommu-groups are part of the IOMMU-API, with an
> > interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> > alias handling itself should be a function of the interface to the IOMMU
> > driver. In general the interface should not be bus specific.
> > 
> > So a void pointer seems the only logical choice then. But I would not
> > limit its scope to alias handling. How about making it a bus-private
> > pointer where IOMMU drivers store bus-specific information. That way we
> > make sure that there is one struct per bus-type for this pointer, and
> > not one structure per IOMMU driver.
> 
> I thought of another approach that may actually be more 3.6 worthy.
> What if we just make the iommu driver handle it?  For instance,
> amd_iommu can walk the alias table looking for entries that use the same
> alias and get the device via pci_get_bus_and_slot.  If it finds a device
> with an iommu group, it attaches the new device to the same group,
> hiding anything about aliases from the group layer.  It just groups all
> devices within the range.  I think the only complication is making sure
> we're safe around device hotplug while we're doing this.  Thanks,

I think this could work.  Instead of searching for other devices, check
for or allocate an iommu group on the alias dev_data; any virtual
aliases then use that iommu group.  Florian, could you test this as well?
Thanks,

Alex
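
The idea in prose form, as a minimal single-threaded userspace sketch: the group is cached on the alias entry and created on first use, so every device that resolves to the same alias picks up the same group. The fake_* types and alias_group_get are hypothetical stand-ins for iommu_dev_data, iommu_group and iommu_group_alloc(); locking and reference counting are deliberately left out.

```c
#include <stddef.h>

#define FAKE_GROUP_POOL_SIZE 8

struct fake_group {
	int id;
};

struct fake_dev_data {
	unsigned short devid;
	struct fake_group *group;	/* NULL until first use */
};

static struct fake_group fake_group_pool[FAKE_GROUP_POOL_SIZE];
static size_t fake_group_pool_used;

/* Stand-in allocator; returns NULL when the pool is exhausted. */
static struct fake_group *fake_group_alloc(void)
{
	struct fake_group *g;

	if (fake_group_pool_used == FAKE_GROUP_POOL_SIZE)
		return NULL;

	g = &fake_group_pool[fake_group_pool_used];
	g->id = (int)fake_group_pool_used;
	fake_group_pool_used++;
	return g;
}

/* Return the group cached on the alias entry, allocating it if needed. */
static struct fake_group *alias_group_get(struct fake_dev_data *alias_data)
{
	if (alias_data->group == NULL)
		alias_data->group = fake_group_alloc();
	return alias_data->group;
}
```

Whether the check-and-allocate step needs a lock in the driver depends on whether device init can run concurrently for devices sharing an alias; the sketch makes no claim either way.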

Signed-off-by: Alex Williamson <alex.william...@redhat.com>
---

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b64502d..22879ed 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -126,6 +126,8 @@ static void free_dev_data(struct iommu_dev_data *dev_data)
 
 	spin_lock_irqsave(&dev_data_list_lock, flags);
 	list_del(&dev_data->dev_data_list);
+	if (dev_data->group)
+		iommu_group_put(dev_data->group);
 	spin_unlock_irqrestore(&dev_data_list_lock, flags);
 
 	kfree(dev_data);
@@ -256,6 +258,37 @@ static bool check_device(struct device *dev)
 	return true;
 }
 
+/*
+ * Sometimes there's no actual device for an alias.  When that happens
+ * we allocate an iommu group on the iommu_dev_data so that it gets used
+ * by anything with the same alias.  We keep the reference from
+ * iommu_group_alloc so the group persists with the iommu_dev_data.
+ */
+static int dev_data_add_iommu_group(struct iommu_dev_data *dev_data,
+				    struct device *dev)
+{
+	unsigned long flags;
+	struct iommu_group *group;
+	int ret = 0;
+
+	spin_lock_irqsave(&dev_data_list_lock, flags);
+	if (!dev_data->group) {
+		group = iommu_group_alloc();
+		if (IS_ERR(group)) {
+			ret = PTR_ERR(group);
+			goto unlock;
+		}
+
+		dev_data->group = group;
+	} else
+		group = dev_data->group;
+
+	ret = iommu_group_add_device(group, dev);
+unlock:
+	spin_unlock_irqrestore(&dev_data_list_lock, flags);
+	return ret;
+}
+
 static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 {
 	pci_dev_put(*from);
@@ -264,38 +297,12 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 
 #define REQ_ACS_FLAGS	(PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)
 
-static int iommu_init_device(struct device *dev)
+static int pdev_add_iommu_group(struct pci_dev *pdev, struct device *dev)
 {
-	struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev);
-	struct iommu_dev_data *dev_data;
+	struct pci_dev *dma_pdev = pdev;
 	struct iommu_group *group;
-	u16 alias;
 	int ret;
 
-	if (dev->archdata.iommu)
-		return 0;
-
-	dev_data = find_dev_data(get_device_id(dev));
-	if (!dev_data)
-		return -ENOMEM;
-
-	alias = amd_iommu_alias_table[dev_data->devid];
-	if (alias != dev_data->devid) {
-		struct iommu_dev_data *alias_data;
-
-		alias_data = find_dev_data(alias);
-		if (alias_data == NULL) {
-			pr_err("AMD-Vi: Warning: Unhandled device %s\n",
-					dev_name(dev));
-			free_dev_data(dev_data);
- 

Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 13:50 -0600, Alex Williamson wrote:
> On Wed, 2012-09-26 at 10:21 -0600, Alex Williamson wrote:
> > On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> > > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > > > Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> > > > alias interface to iommu groups.  Seems like this could just be an extra
> > > > parameter to iommu_group_get and iommu_group_add_device (empty in the
> > > > typical case).  Then we have the problem of what's the type for an
> > > > alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> > > > that.  Maybe iommu groups should just treat it as a void* so iommus can
> > > > use a pointer to some structure or a fixed value like a u16 bus:slot.
> > > > Thoughts?
> > > 
> > > Good question. The iommu-groups are part of the IOMMU-API, with an
> > > interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> > > alias handling itself should be a function of the interface to the IOMMU
> > > driver. In general the interface should not be bus specific.
> > > 
> > > So a void pointer seems the only logical choice then. But I would not
> > > limit its scope to alias handling. How about making it a bus-private
> > > pointer where IOMMU drivers store bus-specific information. That way we
> > > make sure that there is one struct per bus-type for this pointer, and
> > > not one structure per IOMMU driver.
> > 
> > I thought of another approach that may actually be more 3.6 worthy.
> > What if we just make the iommu driver handle it?  For instance,
> > amd_iommu can walk the alias table looking for entries that use the same
> > alias and get the device via pci_get_bus_and_slot.  If it finds a device
> > with an iommu group, it attaches the new device to the same group,
> > hiding anything about aliases from the group layer.  It just groups all
> > devices within the range.  I think the only complication is making sure
> > we're safe around device hotplug while we're doing this.  Thanks,
> 
> I think this could work.  Instead of searching for other devices, check
> for or allocate an iommu group on the alias dev_data, any virtual
> aliases use that iommu group.  Florian, could you test this as well?

Here's a lockdep clean version of it:

amd_iommu: Handle aliases not backed by devices

Aliases sometimes don't have a struct pci_dev backing them.  This breaks
our attempt to figure out the topology and device quirks that may affect
IOMMU grouping.  When this happens, allocate an IOMMU group on the
dev_data for the alias and make use of it for all devices referencing
this alias.

Signed-off-by: Alex Williamson <alex.william...@redhat.com>
---

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b64502d..4eacb17 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -71,6 +71,7 @@ static DEFINE_SPINLOCK(iommu_pd_list_lock);
 /* List of all available dev_data structures */
 static LIST_HEAD(dev_data_list);
 static DEFINE_SPINLOCK(dev_data_list_lock);
+static DEFINE_MUTEX(dev_data_iommu_group_lock);
 
 /*
  * Domain for untranslated devices - only allocated
@@ -128,6 +129,9 @@ static void free_dev_data(struct iommu_dev_data *dev_data)
 	list_del(&dev_data->dev_data_list);
 	spin_unlock_irqrestore(&dev_data_list_lock, flags);
 
+	if (dev_data->group)
+		iommu_group_put(dev_data->group);
+
 	kfree(dev_data);
 }
 
@@ -256,6 +260,34 @@ static bool check_device(struct device *dev)
 	return true;
 }
 
+/*
+ * Sometimes there's no actual device for an alias.  When that happens
+ * we allocate an iommu group on the dev_data and use it for anything
+ * aliasing back to this device.  This makes sure that multiple devices
+ * aliased to a non-existent device id all get grouped together.  Hold
+ * on to the reference for the group, it can be static rather than get
+ * automatically reclaimed if this device later gets removed.
+ */
+static int dev_data_add_iommu_group(struct iommu_dev_data *dev_data,
+				    struct device *dev)
+{
+	mutex_lock(&dev_data_iommu_group_lock);
+
+	if (!dev_data->group) {
+		struct iommu_group *group = iommu_group_alloc();
+		if (IS_ERR(group)) {
+			mutex_unlock(&dev_data_iommu_group_lock);
+			return PTR_ERR(group);
+		}
+
+		dev_data->group = group;
+	}
+
+	mutex_unlock(&dev_data_iommu_group_lock);
+
+	return iommu_group_add_device(dev_data->group, dev);
+}
+
 static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 {
 	pci_dev_put(*from);
@@ -264,38 +296,17 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 
 #define REQ_ACS_FLAGS	(PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)
 
-static int iommu_init_device(struct device *dev)
+/*
+ * Given a pci device, look at device quirks and topology between it
+ * and the IOMMU to determine the IOMMU group.  Once we've found or
+ * created 

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Alex Williamson
On Wed, 2012-09-26 at 01:01 +0200, Florian Dazinger wrote:
> Am Tue, 25 Sep 2012 13:43:46 -0600
> schrieb Alex Williamson :
> 
> > On Tue, 2012-09-25 at 20:54 +0200, Florian Dazinger wrote:
> > > Am Tue, 25 Sep 2012 12:32:50 -0600
> > > schrieb Alex Williamson :
> > > 
> > > > On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> > > > > Hi,
> > > > > I think I've found a regression, which causes an early boot crash, I
> > > > > appended the kernel output via jpg file, since I do not have a serial
> > > > > console or sth.
> > > > > 
> > > > > after bisection, it boils down to this commit:
> > > > > 
> > > > > 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> > > > > commit 9dcd61303af862c279df86aa97fde7ce371be774
> > > > > Author: Alex Williamson 
> > > > > Date:   Wed May 30 14:19:07 2012 -0600
> > > > > 
> > > > > amd_iommu: Support IOMMU groups
> > > > > 
> > > > > Add IOMMU group support to AMD-Vi device init and uninit code.
> > > > > Existing notifiers make sure this gets called for each device.
> > > > > 
> > > > > Signed-off-by: Alex Williamson 
> > > > > Signed-off-by: Joerg Roedel 
> > > > > 
> > > > > :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060
> > > > > 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers
> > > > 
> > > > [switching back to mailing list thread]
> > > > 
> > > > I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
> > > > 
> > > > [1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 
> > > > 1300
> > > > [1.485683] AMD-Vi:mmio-addr: feb2
> > > > [1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 
> > > > 00
> > > > [1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
> > > > [1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 
> > > > flags: 00
> > > > [1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 
> > > > 00
> > > > [1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
> > > > [1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 
> > > > flags: 00
> > > > [1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 
> > > > flags: 00
> > > > [1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 
> > > > flags: 00
> > > > [1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 
> > > > flags: 00
> > > > [1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 
> > > > flags: 00
> > > > [1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 
> > > > flags: 00
> > > > [1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 
> > > > flags: 00
> > > > [1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 
> > > > flags: 00
> > > > [1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 
> > > > flags: 00
> > > > [1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 
> > > > flags: 00
> > > > [1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 
> > > > flags: 00
> > > > [1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 
> > > > flags: 00
> > > > [1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 
> > > > flags: 00 devid_to: 08:00.0
> > > > [1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
> > > > [1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 
> > > > flags: 00
> > > > [1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 
> > > > 00
> > > > [1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
> > > > [1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 
> > > > 00
> > > > [1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
> > > > [1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 
> > > > flags: d7
> > > > [1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 
> > > > flags: 00
> > > > [1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 
> > > > flags: 00
> > > > [1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 
> > > > flags: 00 devid_to: 00:14.4
> > > > [1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
> > > > [1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 
> > > > flags: 00
> > > > [1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 
> > > > 00
> > > > [1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
> > > > [1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
> > > > 
> > > > 
> > > > > lspci:
> > > > > 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI 
> > > > > to PCI bridge (external gfx0 port B) (rev 02)
> > > > > 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory 
> > > > > Management Unit (IOMMU)
> > > > > 00:02.0 PCI bridge: Advanced Micro 

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Florian Dazinger
Am Tue, 25 Sep 2012 13:43:46 -0600
schrieb Alex Williamson :

> On Tue, 2012-09-25 at 20:54 +0200, Florian Dazinger wrote:
> > Am Tue, 25 Sep 2012 12:32:50 -0600
> > schrieb Alex Williamson :
> > 
> > > On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> > > > Hi,
> > > > I think I've found a regression, which causes an early boot crash, I
> > > > appended the kernel output via jpg file, since I do not have a serial
> > > > console or sth.
> > > > 
> > > > after bisection, it boils down to this commit:
> > > > 
> > > > 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> > > > commit 9dcd61303af862c279df86aa97fde7ce371be774
> > > > Author: Alex Williamson 
> > > > Date:   Wed May 30 14:19:07 2012 -0600
> > > > 
> > > > amd_iommu: Support IOMMU groups
> > > > 
> > > > Add IOMMU group support to AMD-Vi device init and uninit code.
> > > > Existing notifiers make sure this gets called for each device.
> > > > 
> > > > Signed-off-by: Alex Williamson 
> > > > Signed-off-by: Joerg Roedel 
> > > > 
> > > > :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060
> > > > 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers
> > > 
> > > [switching back to mailing list thread]
> > > 
> > > I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
> > > 
> > > [1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 
> > > 1300
> > > [1.485683] AMD-Vi:mmio-addr: feb2
> > > [1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
> > > [1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
> > > [1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 
> > > flags: 00
> > > [1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
> > > [1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
> > > [1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 
> > > flags: 00
> > > [1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 
> > > flags: 00
> > > [1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 
> > > flags: 00
> > > [1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 
> > > flags: 00
> > > [1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 
> > > flags: 00
> > > [1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 
> > > flags: 00
> > > [1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 
> > > flags: 00
> > > [1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 
> > > flags: 00
> > > [1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 
> > > flags: 00
> > > [1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 
> > > flags: 00
> > > [1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 
> > > flags: 00
> > > [1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 
> > > flags: 00
> > > [1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 
> > > flags: 00 devid_to: 08:00.0
> > > [1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
> > > [1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 
> > > flags: 00
> > > [1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
> > > [1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
> > > [1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
> > > [1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
> > > [1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 
> > > flags: d7
> > > [1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 
> > > flags: 00
> > > [1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 
> > > flags: 00
> > > [1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 
> > > flags: 00 devid_to: 00:14.4
> > > [1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
> > > [1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 
> > > flags: 00
> > > [1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
> > > [1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
> > > [1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
> > > 
> > > 
> > > > lspci:
> > > > 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to 
> > > > PCI bridge (external gfx0 port B) (rev 02)
> > > > 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory 
> > > > Management Unit (IOMMU)
> > > > 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to 
> > > > PCI bridge (PCI express gpp port B)
> > > > 00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to 
> > > > PCI bridge (PCI express gpp port D)
> > > > 00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to 
> > > > PCI bridge (PCI 

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Alex Williamson
On Tue, 2012-09-25 at 20:54 +0200, Florian Dazinger wrote:
> Am Tue, 25 Sep 2012 12:32:50 -0600
> schrieb Alex Williamson :
> 
> > On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> > > Hi,
> > > I think I've found a regression, which causes an early boot crash, I
> > > appended the kernel output via jpg file, since I do not have a serial
> > > console or sth.
> > > 
> > > after bisection, it boils down to this commit:
> > > 
> > > 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> > > commit 9dcd61303af862c279df86aa97fde7ce371be774
> > > Author: Alex Williamson 
> > > Date:   Wed May 30 14:19:07 2012 -0600
> > > 
> > > amd_iommu: Support IOMMU groups
> > > 
> > > Add IOMMU group support to AMD-Vi device init and uninit code.
> > > Existing notifiers make sure this gets called for each device.
> > > 
> > > Signed-off-by: Alex Williamson 
> > > Signed-off-by: Joerg Roedel 
> > > 
> > > :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060
> > > 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers
> > 
> > [switching back to mailing list thread]
> > 
> > I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
> > 
> > [1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
> > [1.485683] AMD-Vi:mmio-addr: feb2
> > [1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
> > [1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
> > [1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 
> > flags: 00
> > [1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
> > [1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
> > [1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 
> > flags: 00
> > [1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 
> > flags: 00
> > [1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 
> > flags: 00
> > [1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 
> > flags: 00
> > [1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 
> > flags: 00
> > [1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 
> > flags: 00
> > [1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 
> > flags: 00
> > [1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 
> > flags: 00
> > [1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 
> > flags: 00
> > [1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 
> > flags: 00
> > [1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 
> > flags: 00
> > [1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 
> > flags: 00
> > [1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 
> > flags: 00 devid_to: 08:00.0
> > [1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
> > [1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 
> > flags: 00
> > [1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
> > [1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
> > [1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
> > [1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
> > [1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 
> > flags: d7
> > [1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 
> > flags: 00
> > [1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 
> > flags: 00
> > [1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 
> > flags: 00 devid_to: 00:14.4
> > [1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
> > [1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 
> > flags: 00
> > [1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
> > [1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
> > [1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
> > 
> > 
> > > lspci:
> > > 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to 
> > > PCI bridge (external gfx0 port B) (rev 02)
> > > 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory 
> > > Management Unit (IOMMU)
> > > 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > > bridge (PCI express gpp port B)
> > > 00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > > bridge (PCI express gpp port D)
> > > 00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > > bridge (PCI express gpp port E)
> > > 00:06.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > > bridge (PCI express gpp port F)
> > > 00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > > bridge (PCI express gpp port G)
> > > 

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Alex Williamson
On Tue, 2012-09-25 at 12:32 -0600, Alex Williamson wrote:
> On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> > Hi,
> > I think I've found a regression, which causes an early boot crash, I
> > appended the kernel output via jpg file, since I do not have a serial
> > console or sth.
> > 
> > after bisection, it boils down to this commit:
> > 
> > 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> > commit 9dcd61303af862c279df86aa97fde7ce371be774
> > Author: Alex Williamson 
> > Date:   Wed May 30 14:19:07 2012 -0600
> > 
> > amd_iommu: Support IOMMU groups
> > 
> > Add IOMMU group support to AMD-Vi device init and uninit code.
> > Existing notifiers make sure this gets called for each device.
> > 
> > Signed-off-by: Alex Williamson 
> > Signed-off-by: Joerg Roedel 
> > 
> > :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060
> > 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers
> 
> [switching back to mailing list thread]
> 
> I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
> 
> [1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
> [1.485683] AMD-Vi:mmio-addr: feb2
> [1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
> [1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
> [1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 
> flags: 00
> [1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
> [1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
> [1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 
> flags: 00
> [1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 
> flags: 00
> [1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 
> flags: 00
> [1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 
> flags: 00
> [1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 
> flags: 00
> [1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 
> flags: 00
> [1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 
> flags: 00
> [1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 
> flags: 00
> [1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 
> flags: 00
> [1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 
> flags: 00
> [1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 
> flags: 00
> [1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 
> flags: 00
> [1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 
> flags: 00 devid_to: 08:00.0
> [1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
> [1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 
> flags: 00
> [1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
> [1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
> [1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
> [1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
> [1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 
> flags: d7
> [1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 
> flags: 00
> [1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 
> flags: 00
> [1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 
> flags: 00 devid_to: 00:14.4
> [1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
> [1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 
> flags: 00
> [1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
> [1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
> [1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
> 
> 
> > lspci:
> > 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (external gfx0 port B) (rev 02)
> > 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory 
> > Management Unit (IOMMU)
> > 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port B)
> > 00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port D)
> > 00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port E)
> > 00:06.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port F)
> > 00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port G)
> > 00:09.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port H)
> > 00:0b.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (NB-SB link)
> > 00:11.0 SATA controller: Advanced Micro Devices [AMD] nee 

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Alex Williamson
On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> Hi,
> I think I've found a regression, which causes an early boot crash, I
> appended the kernel output via jpg file, since I do not have a serial
> console or sth.
> 
> after bisection, it boils down to this commit:
> 
> 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> commit 9dcd61303af862c279df86aa97fde7ce371be774
> Author: Alex Williamson 
> Date:   Wed May 30 14:19:07 2012 -0600
> 
> amd_iommu: Support IOMMU groups
> 
> Add IOMMU group support to AMD-Vi device init and uninit code.
> Existing notifiers make sure this gets called for each device.
> 
> Signed-off-by: Alex Williamson 
> Signed-off-by: Joerg Roedel 
> 
> :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers

[switching back to mailing list thread]

I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:

[1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
[1.485683] AMD-Vi:mmio-addr: feb2
[1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
[1.485935] AMD-Vi:   DEV_RANGE_END           devid: 00:00.2
[1.485969] AMD-Vi:   DEV_SELECT              devid: 00:02.0 flags: 00
[1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
[1.486036] AMD-Vi:   DEV_RANGE_END           devid: 01:00.1
[1.486070] AMD-Vi:   DEV_SELECT              devid: 00:04.0 flags: 00
[1.486103] AMD-Vi:   DEV_SELECT              devid: 02:00.0 flags: 00
[1.486137] AMD-Vi:   DEV_SELECT              devid: 00:05.0 flags: 00
[1.486170] AMD-Vi:   DEV_SELECT              devid: 03:00.0 flags: 00
[1.486204] AMD-Vi:   DEV_SELECT              devid: 00:06.0 flags: 00
[1.486238] AMD-Vi:   DEV_SELECT              devid: 04:00.0 flags: 00
[1.486271] AMD-Vi:   DEV_SELECT              devid: 00:07.0 flags: 00
[1.486305] AMD-Vi:   DEV_SELECT              devid: 05:00.0 flags: 00
[1.486338] AMD-Vi:   DEV_SELECT              devid: 00:09.0 flags: 00
[1.486372] AMD-Vi:   DEV_SELECT              devid: 06:00.0 flags: 00
[1.486406] AMD-Vi:   DEV_SELECT              devid: 00:0b.0 flags: 00
[1.486439] AMD-Vi:   DEV_SELECT              devid: 07:00.0 flags: 00
[1.486473] AMD-Vi:   DEV_ALIAS_RANGE         devid: 08:01.0 flags: 00 devid_to: 08:00.0
[1.486510] AMD-Vi:   DEV_RANGE_END           devid: 08:1f.7
[1.486548] AMD-Vi:   DEV_SELECT              devid: 00:11.0 flags: 00
[1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
[1.486620] AMD-Vi:   DEV_RANGE_END           devid: 00:12.2
[1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
[1.486688] AMD-Vi:   DEV_RANGE_END           devid: 00:13.2
[1.486721] AMD-Vi:   DEV_SELECT              devid: 00:14.0 flags: d7
[1.486755] AMD-Vi:   DEV_SELECT              devid: 00:14.3 flags: 00
[1.486788] AMD-Vi:   DEV_SELECT              devid: 00:14.4 flags: 00
[1.486822] AMD-Vi:   DEV_ALIAS_RANGE         devid: 09:00.0 flags: 00 devid_to: 00:14.4
[1.486859] AMD-Vi:   DEV_RANGE_END           devid: 09:1f.7
[1.486897] AMD-Vi:   DEV_SELECT              devid: 00:14.5 flags: 00
[1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
[1.486965] AMD-Vi:   DEV_RANGE_END           devid: 00:16.2
[1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
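
To make the two DEV_ALIAS_RANGE entries in this dump concrete, here is a small sketch of how a device id could be mapped to its alias. It assumes the usual PCI packing of bus:dev.fn into 16 bits (bus in the high byte, then 5 bits of device and 3 bits of function). DEVID(), struct alias_range, the ranges[] table and resolve_alias() are hypothetical names made up for this illustration; they are not the driver's data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack bus:dev.fn into a 16-bit device id (assumed standard PCI layout). */
#define DEVID(bus, dev, fn) ((uint16_t)(((bus) << 8) | ((dev) << 3) | (fn)))

/* Sanity check of the packing: 00:14.4 is device 0x14, function 4. */
static_assert(DEVID(0x00, 0x14, 4) == 0x00a4, "00:14.4 packs to 0x00a4");

/* One DEV_ALIAS_RANGE ... DEV_RANGE_END pair from the dump. */
struct alias_range {
	uint16_t first;		/* devid on the DEV_ALIAS_RANGE line */
	uint16_t last;		/* devid on the following DEV_RANGE_END line */
	uint16_t alias_to;	/* devid_to on the DEV_ALIAS_RANGE line */
};

/* The two alias ranges, transcribed by hand from the listing. */
static const struct alias_range ranges[] = {
	{ DEVID(0x08, 0x01, 0), DEVID(0x08, 0x1f, 7), DEVID(0x08, 0x00, 0) },
	{ DEVID(0x09, 0x00, 0), DEVID(0x09, 0x1f, 7), DEVID(0x00, 0x14, 4) },
};

/* Return the alias for 'id', or 'id' itself when no range covers it. */
static uint16_t resolve_alias(uint16_t id)
{
	size_t i;

	for (i = 0; i < sizeof(ranges) / sizeof(ranges[0]); i++) {
		if (id >= ranges[i].first && id <= ranges[i].last)
			return ranges[i].alias_to;
	}
	return id;
}
```

Read this way, anything on bus 08 from 08:01.0 upward resolves to 08:00.0, and everything on bus 09 resolves to 00:14.4, while devices outside both ranges resolve to themselves.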


> lspci:
> 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
> 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory Management Unit (IOMMU)
> 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port B)
> 00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port D)
> 00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port E)
> 00:06.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port F)
> 00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port G)
> 00:09.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port H)
> 00:0b.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (NB-SB link)
> 00:11.0 SATA controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
> 00:12.0 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> 00:12.2 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI 

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Alex Williamson
On Tue, 2012-09-25 at 20:54 +0200, Florian Dazinger wrote:
 Am Tue, 25 Sep 2012 12:32:50 -0600
 schrieb Alex Williamson alex.william...@redhat.com:
 
  On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
   Hi,
   I think I've found a regression, which causes an early boot crash, I
   appended the kernel output via jpg file, since I do not have a serial
   console or sth.
   
   after bisection, it boils down to this commit:
   
   9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
   commit 9dcd61303af862c279df86aa97fde7ce371be774
   Author: Alex Williamson alex.william...@redhat.com
   Date:   Wed May 30 14:19:07 2012 -0600
   
   amd_iommu: Support IOMMU groups
   
   Add IOMMU group support to AMD-Vi device init and uninit code.
   Existing notifiers make sure this gets called for each device.
   
   Signed-off-by: Alex Williamson alex.william...@redhat.com
   Signed-off-by: Joerg Roedel joerg.roe...@amd.com
   
   :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060
   837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers
  
  [switching back to mailing list thread]
  
  I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
  
  [1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
  [1.485683] AMD-Vi:mmio-addr: feb2
  [1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
  [1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
  [1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 flags: 00
  [1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
  [1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
  [1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 flags: 00
  [1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 flags: 00
  [1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 flags: 00
  [1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 flags: 00
  [1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 flags: 00
  [1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 flags: 00
  [1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 flags: 00
  [1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 flags: 00
  [1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 flags: 00
  [1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 flags: 00
  [1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 flags: 00
  [1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 flags: 00
  [1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 flags: 00 devid_to: 08:00.0
  [1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
  [1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 flags: 00
  [1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
  [1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
  [1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
  [1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
  [1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 flags: d7
  [1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 flags: 00
  [1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 flags: 00
  [1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 flags: 00 devid_to: 00:14.4
  [1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
  [1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 flags: 00
  [1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
  [1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
  [1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
  
  
   lspci:
   00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
   00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory Management Unit (IOMMU)
   00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port B)
   00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port D)
   00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port E)
   00:06.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port F)
   00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port G)
   00:09.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port H)
   00:0b.0 PCI bridge: Advanced Micro 

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Florian Dazinger
On Tue, 25 Sep 2012 13:43:46 -0600,
Alex Williamson alex.william...@redhat.com wrote:

 On Tue, 2012-09-25 at 20:54 +0200, Florian Dazinger wrote:
  On Tue, 25 Sep 2012 12:32:50 -0600,
  Alex Williamson alex.william...@redhat.com wrote:
  
   On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
Hi,
I think I've found a regression, which causes an early boot crash, I
appended the kernel output via jpg file, since I do not have a serial
console or sth.

after bisection, it boils down to this commit:

9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
commit 9dcd61303af862c279df86aa97fde7ce371be774
Author: Alex Williamson alex.william...@redhat.com
Date:   Wed May 30 14:19:07 2012 -0600

amd_iommu: Support IOMMU groups

Add IOMMU group support to AMD-Vi device init and uninit code.
Existing notifiers make sure this gets called for each device.

Signed-off-by: Alex Williamson alex.william...@redhat.com
Signed-off-by: Joerg Roedel joerg.roe...@amd.com

:040000 040000 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers
   
   [switching back to mailing list thread]
   
   I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
   
   [1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
   [1.485683] AMD-Vi:mmio-addr: feb2
   [1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
   [1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
   [1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 flags: 00
   [1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
   [1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
   [1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 flags: 00
   [1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 flags: 00
   [1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 flags: 00
   [1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 flags: 00
   [1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 flags: 00
   [1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 flags: 00
   [1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 flags: 00
   [1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 flags: 00
   [1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 flags: 00
   [1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 flags: 00
   [1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 flags: 00
   [1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 flags: 00
   [1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 flags: 00 devid_to: 08:00.0
   [1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
   [1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 flags: 00
   [1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
   [1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
   [1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
   [1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
   [1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 flags: d7
   [1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 flags: 00
   [1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 flags: 00
   [1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 flags: 00 devid_to: 00:14.4
   [1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
   [1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 flags: 00
   [1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
   [1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
   [1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
   
   
lspci:
00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory Management Unit (IOMMU)
00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port B)
00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port D)
00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port E)
00:06.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port F)
00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI 

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Alex Williamson
On Wed, 2012-09-26 at 01:01 +0200, Florian Dazinger wrote:
 On Tue, 25 Sep 2012 13:43:46 -0600,
 Alex Williamson alex.william...@redhat.com wrote:
 
  On Tue, 2012-09-25 at 20:54 +0200, Florian Dazinger wrote:
   On Tue, 25 Sep 2012 12:32:50 -0600,
   Alex Williamson alex.william...@redhat.com wrote:
   
On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
 Hi,
 I think I've found a regression, which causes an early boot crash, I
 appended the kernel output via jpg file, since I do not have a serial
 console or sth.
 
 after bisection, it boils down to this commit:
 
 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
 commit 9dcd61303af862c279df86aa97fde7ce371be774
 Author: Alex Williamson alex.william...@redhat.com
 Date:   Wed May 30 14:19:07 2012 -0600
 
 amd_iommu: Support IOMMU groups
 
 Add IOMMU group support to AMD-Vi device init and uninit code.
 Existing notifiers make sure this gets called for each device.
 
 Signed-off-by: Alex Williamson alex.william...@redhat.com
 Signed-off-by: Joerg Roedel joerg.roe...@amd.com
 
 :040000 040000 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers

[switching back to mailing list thread]

I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:

[1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
[1.485683] AMD-Vi:mmio-addr: feb2
[1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
[1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
[1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 flags: 00
[1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
[1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
[1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 flags: 00
[1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 flags: 00
[1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 flags: 00
[1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 flags: 00
[1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 flags: 00
[1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 flags: 00
[1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 flags: 00
[1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 flags: 00
[1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 flags: 00
[1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 flags: 00
[1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 flags: 00
[1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 flags: 00
[1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 flags: 00 devid_to: 08:00.0
[1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
[1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 flags: 00
[1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
[1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
[1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
[1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
[1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 flags: d7
[1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 flags: 00
[1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 flags: 00
[1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 flags: 00 devid_to: 00:14.4
[1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
[1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 flags: 00
[1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
[1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
[1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40


 lspci:
 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory Management Unit (IOMMU)
 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port B)
 00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port D)
 00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port E)