Re: 3.6-rc7 boot crash + bisection

2012-09-28 Thread Roedel, Joerg
On Wed, Sep 26, 2012 at 04:04:03PM -0600, Alex Williamson wrote:

> Here's a lockdep clean version of it:
> 
> amd_iommu: Handle aliases not backed by devices
> 
> Aliases sometimes don't have a struct pci_dev backing them.  This breaks
> our attempt to figure out the topology and device quirks that may affect
> IOMMU grouping.  When this happens, allocate an IOMMU group on the
> dev_data for the alias and make use of it for all devices referencing
> this alias.

Yes, this is the real fix. But it is too big for v3.6 at this time, so
I would take this for 3.7 and use my small fix for 3.6.

> Signed-off-by: Alex Williamson 
> ---
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index b64502d..4eacb17 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -71,6 +71,7 @@ static DEFINE_SPINLOCK(iommu_pd_list_lock);
>  /* List of all available dev_data structures */
>  static LIST_HEAD(dev_data_list);
>  static DEFINE_SPINLOCK(dev_data_list_lock);
> +static DEFINE_MUTEX(dev_data_iommu_group_lock);

I think this lock is not necessary. The iommu_init_device routine does
not run multiple times in parallel for the same device. So we should be
safe on that side.


Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 3.6-rc7 boot crash + bisection

2012-09-27 Thread Florian Dazinger
On Wed, 26 Sep 2012 16:04:03 -0600, Alex Williamson wrote:

> On Wed, 2012-09-26 at 13:50 -0600, Alex Williamson wrote:
> > On Wed, 2012-09-26 at 10:21 -0600, Alex Williamson wrote:
> > > On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> > > > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > > > > Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> > > > > alias interface to iommu groups.  Seems like this could just be an extra
> > > > > parameter to iommu_group_get and iommu_group_add_device (empty in the
> > > > > typical case).  Then we have the problem of what's the type for an
> > > > > alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> > > > > that.  Maybe iommu groups should just treat it as a void* so iommus can
> > > > > use a pointer to some structure or a fixed value like a u16 bus:slot.
> > > > > Thoughts?
> > > > 
> > > > Good question. The iommu-groups are part of the IOMMU-API, with an
> > > > interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> > > > alias handling itself should be a function of the interface to the IOMMU
> > > > driver. In general the interface should not be bus specific.
> > > > 
> > > > So a void pointer seems the only logical choice then. But I would not
> > > > limit its scope to alias handling. How about making it a bus-private
> > > > pointer where IOMMU drivers store bus-specific information. That way we
> > > > make sure that there is one struct per bus-type for this pointer, and
> > > > not one structure per IOMMU driver.
> > > 
> > > I thought of another approach that may actually be more 3.6 worthy.
> > > What if we just make the iommu driver handle it?  For instance,
> > > amd_iommu can walk the alias table looking for entries that use the same
> > > alias and get the device via pci_get_bus_and_slot.  If it finds a device
> > > with an iommu group, it attaches the new device to the same group,
> > > hiding anything about aliases from the group layer.  It just groups all
> > > devices within the range.  I think the only complication is making sure
> > > we're safe around device hotplug while we're doing this.  Thanks,
> > 
> > I think this could work.  Instead of searching for other devices, check
> > for or allocate an iommu group on the alias dev_data, any "virtual"
> > aliases use that iommu group.  Florian, could you test this as well?
> 
> Here's a lockdep clean version of it:
> 
> amd_iommu: Handle aliases not backed by devices
> 
[ skipped patch ]

Yes, this patch works for me, too. I also tested your second patch; it
worked as well.
Thanks, Florian


Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 13:50 -0600, Alex Williamson wrote:
> On Wed, 2012-09-26 at 10:21 -0600, Alex Williamson wrote:
> > On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> > > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > > > Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> > > > alias interface to iommu groups.  Seems like this could just be an extra
> > > > parameter to iommu_group_get and iommu_group_add_device (empty in the
> > > > typical case).  Then we have the problem of what's the type for an
> > > > alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> > > > that.  Maybe iommu groups should just treat it as a void* so iommus can
> > > > use a pointer to some structure or a fixed value like a u16 bus:slot.
> > > > Thoughts?
> > > 
> > > Good question. The iommu-groups are part of the IOMMU-API, with an
> > > interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> > > alias handling itself should be a function of the interface to the IOMMU
> > > driver. In general the interface should not be bus specific.
> > > 
> > > So a void pointer seems the only logical choice then. But I would not
> > > limit its scope to alias handling. How about making it a bus-private
> > > pointer where IOMMU drivers store bus-specific information. That way we
> > > make sure that there is one struct per bus-type for this pointer, and
> > > not one structure per IOMMU driver.
> > 
> > I thought of another approach that may actually be more 3.6 worthy.
> > What if we just make the iommu driver handle it?  For instance,
> > amd_iommu can walk the alias table looking for entries that use the same
> > alias and get the device via pci_get_bus_and_slot.  If it finds a device
> > with an iommu group, it attaches the new device to the same group,
> > hiding anything about aliases from the group layer.  It just groups all
> > devices within the range.  I think the only complication is making sure
> > we're safe around device hotplug while we're doing this.  Thanks,
> 
> I think this could work.  Instead of searching for other devices, check
> for or allocate an iommu group on the alias dev_data, any "virtual"
> aliases use that iommu group.  Florian, could you test this as well?

Here's a lockdep clean version of it:

amd_iommu: Handle aliases not backed by devices

Aliases sometimes don't have a struct pci_dev backing them.  This breaks
our attempt to figure out the topology and device quirks that may affect
IOMMU grouping.  When this happens, allocate an IOMMU group on the
dev_data for the alias and make use of it for all devices referencing
this alias.

Signed-off-by: Alex Williamson 
---

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b64502d..4eacb17 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -71,6 +71,7 @@ static DEFINE_SPINLOCK(iommu_pd_list_lock);
 /* List of all available dev_data structures */
 static LIST_HEAD(dev_data_list);
 static DEFINE_SPINLOCK(dev_data_list_lock);
+static DEFINE_MUTEX(dev_data_iommu_group_lock);
 
 /*
  * Domain for untranslated devices - only allocated
@@ -128,6 +129,9 @@ static void free_dev_data(struct iommu_dev_data *dev_data)
 	list_del(&dev_data->dev_data_list);
 	spin_unlock_irqrestore(&dev_data_list_lock, flags);
 
+	if (dev_data->group)
+		iommu_group_put(dev_data->group);
+
 	kfree(dev_data);
 }
 
@@ -256,6 +260,34 @@ static bool check_device(struct device *dev)
 	return true;
 }
 
+/*
+ * Sometimes there's no actual device for an alias.  When that happens
+ * we allocate an iommu group on the dev_data and use it for anything
+ * aliasing back to this device.  This makes sure that multiple devices
+ * aliased to a non-existent device id all get grouped together.  Hold
+ * on to the reference for the group, it can be static rather than get
+ * automatically reclaimed if this device later gets removed.
+ */
+static int dev_data_add_iommu_group(struct iommu_dev_data *dev_data,
+				    struct device *dev)
+{
+	mutex_lock(&dev_data_iommu_group_lock);
+
+	if (!dev_data->group) {
+		struct iommu_group *group = iommu_group_alloc();
+		if (IS_ERR(group)) {
+			mutex_unlock(&dev_data_iommu_group_lock);
+			return PTR_ERR(group);
+		}
+
+		dev_data->group = group;
+	}
+
+	mutex_unlock(&dev_data_iommu_group_lock);
+
+	return iommu_group_add_device(dev_data->group, dev);
+}
+
 static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 {
 	pci_dev_put(*from);
@@ -264,38 +296,17 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 
 #define REQ_ACS_FLAGS  (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)
 
-static int iommu_init_device(struct device *dev)
+/*
+ * Given a pci device, look at device quirks and topology between it
+ *

Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 10:21 -0600, Alex Williamson wrote:
> On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > > Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> > > alias interface to iommu groups.  Seems like this could just be an extra
> > > parameter to iommu_group_get and iommu_group_add_device (empty in the
> > > typical case).  Then we have the problem of what's the type for an
> > > alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> > > that.  Maybe iommu groups should just treat it as a void* so iommus can
> > > use a pointer to some structure or a fixed value like a u16 bus:slot.
> > > Thoughts?
> > 
> > Good question. The iommu-groups are part of the IOMMU-API, with an
> > interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> > alias handling itself should be a function of the interface to the IOMMU
> > driver. In general the interface should not be bus specific.
> > 
> > So a void pointer seems the only logical choice then. But I would not
> > limit its scope to alias handling. How about making it a bus-private
> > pointer where IOMMU drivers store bus-specific information. That way we
> > make sure that there is one struct per bus-type for this pointer, and
> > not one structure per IOMMU driver.
> 
> I thought of another approach that may actually be more 3.6 worthy.
> What if we just make the iommu driver handle it?  For instance,
> amd_iommu can walk the alias table looking for entries that use the same
> alias and get the device via pci_get_bus_and_slot.  If it finds a device
> with an iommu group, it attaches the new device to the same group,
> hiding anything about aliases from the group layer.  It just groups all
> devices within the range.  I think the only complication is making sure
> we're safe around device hotplug while we're doing this.  Thanks,

I think this could work.  Instead of searching for other devices, check
for or allocate an iommu group on the alias dev_data; any "virtual"
aliases then use that iommu group.  Florian, could you test this as well?
Thanks,

Alex

Signed-off-by: Alex Williamson 
---

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b64502d..22879ed 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -126,6 +126,8 @@ static void free_dev_data(struct iommu_dev_data *dev_data)
 
 	spin_lock_irqsave(&dev_data_list_lock, flags);
 	list_del(&dev_data->dev_data_list);
+	if (dev_data->group)
+		iommu_group_put(dev_data->group);
 	spin_unlock_irqrestore(&dev_data_list_lock, flags);
 
 	kfree(dev_data);
@@ -256,6 +258,37 @@ static bool check_device(struct device *dev)
 	return true;
 }
 
+/*
+ * Sometimes there's no actual device for an alias.  When that happens
+ * we allocate an iommu group on the iommu_dev_data so that it gets used
+ * by anything with the same alias.  We keep the reference from
+ * iommu_group_alloc so the group persists with the iommu_dev_data.
+ */
+static int dev_data_add_iommu_group(struct iommu_dev_data *dev_data,
+				    struct device *dev)
+{
+	unsigned long flags;
+	struct iommu_group *group;
+	int ret = 0;
+
+	spin_lock_irqsave(&dev_data_list_lock, flags);
+	if (!dev_data->group) {
+		group = iommu_group_alloc();
+		if (IS_ERR(group)) {
+			ret = PTR_ERR(group);
+			goto unlock;
+		}
+
+		dev_data->group = group;
+	} else
+		group = dev_data->group;
+
+	ret = iommu_group_add_device(group, dev);
+unlock:
+	spin_unlock_irqrestore(&dev_data_list_lock, flags);
+	return ret;
+}
+
 static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 {
 	pci_dev_put(*from);
@@ -264,38 +297,12 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 
 #define REQ_ACS_FLAGS  (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF)
 
-static int iommu_init_device(struct device *dev)
+static int pdev_add_iommu_group(struct pci_dev *pdev, struct device *dev)
 {
-	struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev);
-	struct iommu_dev_data *dev_data;
+	struct pci_dev *dma_pdev = pdev;
 	struct iommu_group *group;
-	u16 alias;
 	int ret;
 
-	if (dev->archdata.iommu)
-		return 0;
-
-	dev_data = find_dev_data(get_device_id(dev));
-	if (!dev_data)
-		return -ENOMEM;
-
-	alias = amd_iommu_alias_table[dev_data->devid];
-	if (alias != dev_data->devid) {
-		struct iommu_dev_data *alias_data;
-
-		alias_data = find_dev_data(alias);
-		if (alias_data == NULL) {
-			pr_err("AMD-Vi: Warning: Unhandled device %s\n",
-				dev_name(dev));
-

Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Florian Dazinger
On Wed, 26 Sep 2012 17:04:07 +0200, "Roedel, Joerg" wrote:

> On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> > Assuming this works, it may be ok as a 3.7 fix, but if there was
> > actually more than one device behind the alias we'd expose them as
> > separate iommu groups.  I don't think that's what we want.  Maybe it
> > should at least get a pr_warn.  Thanks,
> 
> True, we need something more generic as the real fix. When Florian
> reports success I'll try to get this still into 3.6, otherwise to
> -stable.
> 
> 
>   Joerg
> 

Updating to the newest BIOS revision does not make any difference: rc7
crashes, rc7+patch does not. I definitely need this patch.
If there is more you want me to test, please tell me.

Florian


Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Florian Dazinger
On Wed, 26 Sep 2012 17:04:07 +0200, "Roedel, Joerg" wrote:

> On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> > Assuming this works, it may be ok as a 3.7 fix, but if there was
> > actually more than one device behind the alias we'd expose them as
> > separate iommu groups.  I don't think that's what we want.  Maybe it
> > should at least get a pr_warn.  Thanks,
> 
> True, we need something more generic as the real fix. When Florian
> reports success I'll try to get this still into 3.6, otherwise to
> -stable.
> 
> 
>   Joerg
> 

It still fails with the card in a *different* slot.
But with the patch applied, everything works, so this fixes it for me! Thanks a
lot. See below for the relevant parts of dmesg and lspci. I'll still try
the newest BIOS rev. and report back.
Thanks, Florian


dmesg (kernel-3.5.4):

[0.448252] RPC: Registered tcp NFSv4.1 backchannel transport module.
[1.471021] pci :01:00.0: Boot video device
[1.471118] PCI: CLS 64 bytes, default 64
[1.473864] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
[1.473902] AMD-Vi:mmio-addr: feb2
[1.474119] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
[1.474153] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
[1.474187] AMD-Vi:   DEV_SELECT  devid: 00:02.0 flags: 00
[1.474220] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
[1.474254] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
[1.474287] AMD-Vi:   DEV_SELECT  devid: 00:04.0 flags: 00
[1.474321] AMD-Vi:   DEV_SELECT  devid: 02:00.0 flags: 00
[1.474354] AMD-Vi:   DEV_SELECT  devid: 00:05.0 flags: 00
[1.474388] AMD-Vi:   DEV_SELECT  devid: 03:00.0 flags: 00
[1.474421] AMD-Vi:   DEV_SELECT  devid: 00:06.0 flags: 00
[1.474455] AMD-Vi:   DEV_SELECT  devid: 04:00.0 flags: 00
[1.474488] AMD-Vi:   DEV_SELECT  devid: 00:07.0 flags: 00
[1.474522] AMD-Vi:   DEV_SELECT  devid: 05:00.0 flags: 00
[1.474555] AMD-Vi:   DEV_SELECT  devid: 00:09.0 flags: 00
[1.474589] AMD-Vi:   DEV_SELECT  devid: 06:00.0 flags: 00
[1.474622] AMD-Vi:   DEV_SELECT  devid: 00:0d.0 flags: 00
[1.474656] AMD-Vi:   DEV_SELECT  devid: 07:00.0 flags: 00
[1.474689] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 flags: 00 devid_to: 08:00.0
[1.474726] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
[1.474764] AMD-Vi:   DEV_SELECT  devid: 00:11.0 flags: 00
[1.474798] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
[1.474836] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
[1.474870] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
[1.474903] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
[1.474937] AMD-Vi:   DEV_SELECT  devid: 00:14.0 flags: d7
[1.474970] AMD-Vi:   DEV_SELECT  devid: 00:14.3 flags: 00
[1.475004] AMD-Vi:   DEV_SELECT  devid: 00:14.4 flags: 00
[1.475038] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 flags: 00 devid_to: 00:14.4
[1.475074] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
[1.475112] AMD-Vi:   DEV_SELECT  devid: 00:14.5 flags: 00
[1.475146] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
[1.475180] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
[1.475271] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
[1.529007] pci :00:00.2: irq 72 for MSI/MSI-X
[1.539126] AMD-Vi: Lazy IO/TLB flushing enabled
[1.539750] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[1.539787] software IO TLB [mem 0xc9728000-0xcd727fff] (64MB) mapped at [8800c9728000-8800cd727fff]
[1.539957] kvm: Nested Virtualization enabled

lspci (kernel-3.5.4):
00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
Subsystem: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B)
Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- 
Capabilities: [54] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: fee0f00c  Data: 41a9
Capabilities: [64] HyperTransport: MSI Mapping Enable+ Fixed+

00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port B) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- 

Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> > alias interface to iommu groups.  Seems like this could just be an extra
> > parameter to iommu_group_get and iommu_group_add_device (empty in the
> > typical case).  Then we have the problem of what's the type for an
> > alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> > that.  Maybe iommu groups should just treat it as a void* so iommus can
> > use a pointer to some structure or a fixed value like a u16 bus:slot.
> > Thoughts?
> 
> Good question. The iommu-groups are part of the IOMMU-API, with an
> interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> alias handling itself should be a function of the interface to the IOMMU
> driver. In general the interface should not be bus specific.
> 
> So a void pointer seems the only logical choice then. But I would not
> limit its scope to alias handling. How about making it a bus-private
> pointer where IOMMU drivers store bus-specific information. That way we
> make sure that there is one struct per bus-type for this pointer, and
> not one structure per IOMMU driver.

I thought of another approach that may actually be more 3.6 worthy.
What if we just make the iommu driver handle it?  For instance,
amd_iommu can walk the alias table looking for entries that use the same
alias and get the device via pci_get_bus_and_slot.  If it finds a device
with an iommu group, it attaches the new device to the same group,
hiding anything about aliases from the group layer.  It just groups all
devices within the range.  I think the only complication is making sure
we're safe around device hotplug while we're doing this.  Thanks,

Alex
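
A rough user-space sketch of the alias-table walk described above. Everything
here is a stand-in and an assumption made for illustration: the two arrays
replace amd_iommu_alias_table and the per-device group lookup, the names
alias_table_init and find_group_by_alias are hypothetical, and hotplug
locking is ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_DEVIDS 0x10000u

/* Stand-in for struct iommu_group; only identity matters here. */
struct group {
	int id;
};

/* alias_table[devid] == devid means "no alias" (model of the driver table). */
static uint16_t alias_table[NUM_DEVIDS];
/* group_of[devid] is NULL when no device exists there or it has no group yet. */
static struct group *group_of[NUM_DEVIDS];

/* Start with every device id aliasing to itself and no groups assigned. */
void alias_table_init(void)
{
	for (uint32_t i = 0; i < NUM_DEVIDS; i++) {
		alias_table[i] = (uint16_t)i;
		group_of[i] = NULL;
	}
}

/*
 * Walk the table for other entries that share devid's alias and return
 * the first group found, or NULL if none of them has a group yet.
 */
struct group *find_group_by_alias(uint32_t devid)
{
	assert(devid < NUM_DEVIDS);
	uint16_t alias = alias_table[devid];

	for (uint32_t i = 0; i < NUM_DEVIDS; i++) {
		if (i == devid || alias_table[i] != alias)
			continue;
		if (group_of[i])
			return group_of[i];
	}
	return NULL;
}
```

In the driver the lookup would presumably go through pci_get_bus_and_slot
and the device's existing group rather than a flat array. A NULL result
means the caller has to create a new group for the device.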




Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 17:04 +0200, Roedel, Joerg wrote:
> On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> > Assuming this works, it may be ok as a 3.7 fix, but if there was
> > actually more than one device behind the alias we'd expose them as
> > separate iommu groups.  I don't think that's what we want.  Maybe it
> > should at least get a pr_warn.  Thanks,
> 
> True, we need something more generic as the real fix. When Florian
> reports success I'll try to get this still into 3.6, otherwise to
> -stable.

Yes, 3.6 is what I meant to type.  Thanks,

Alex




Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Roedel, Joerg
On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
> alias interface to iommu groups.  Seems like this could just be an extra
> parameter to iommu_group_get and iommu_group_add_device (empty in the
> typical case).  Then we have the problem of what's the type for an
> alias?  For AMD-Vi, it's a u16, but we need to be more generic than
> that.  Maybe iommu groups should just treat it as a void* so iommus can
> use a pointer to some structure or a fixed value like a u16 bus:slot.
> Thoughts?

Good question. The iommu-groups are part of the IOMMU-API, with an
interface to the IOMMU drivers and one to the users of IOMMU-API. So the
alias handling itself should be a function of the interface to the IOMMU
driver. In general the interface should not be bus specific.

So a void pointer seems the only logical choice then. But I would not
limit its scope to alias handling. How about making it a bus-private
pointer where IOMMU drivers store bus-specific information. That way we
make sure that there is one struct per bus-type for this pointer, and
not one structure per IOMMU driver.


Joerg
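
One possible reading of the bus-private pointer idea, as a small user-space
sketch. Nothing here is existing IOMMU-API; struct group, enum bus_type_id,
struct pci_group_priv and group_matches_pci_alias are all hypothetical names
chosen for illustration. The generic layer only carries an opaque pointer,
and there is one private struct per bus type rather than per IOMMU driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bus tags; the generic group layer never looks inside. */
enum bus_type_id {
	BUS_PCI,
	BUS_PLATFORM
};

/* Hypothetical generic group: bus-agnostic apart from the opaque pointer. */
struct group {
	int id;
	enum bus_type_id bus;
	void *bus_private;	/* owned and interpreted by bus-specific code */
};

/* Hypothetical PCI-specific data: the request-id the group is keyed on. */
struct pci_group_priv {
	uint16_t alias;
};

/* PCI-side helper: returns 1 if the group carries this PCI alias, else 0. */
int group_matches_pci_alias(const struct group *g, uint16_t alias)
{
	const struct pci_group_priv *priv;

	assert(g != NULL);
	if (g->bus != BUS_PCI || g->bus_private == NULL)
		return 0;

	priv = g->bus_private;
	return priv->alias == alias;
}
```

Only code that knows the bus type casts the pointer back, so the generic
group interface stays free of PCI details.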




Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Roedel, Joerg
On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> Assuming this works, it may be ok as a 3.7 fix, but if there was
> actually more than one device behind the alias we'd expose them as
> separate iommu groups.  I don't think that's what we want.  Maybe it
> should at least get a pr_warn.  Thanks,

True, we need something more generic as the real fix. When Florian
reports success I'll try to get this still into 3.6, otherwise to
-stable.


Joerg




Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 16:43 +0200, Roedel, Joerg wrote:
> Florian,
> 
> On Wed, Sep 26, 2012 at 01:01:54AM +0200, Florian Dazinger wrote:
> > you're right, either "amd_iommu=off" or removing the audio card makes
> > the failure disappear. I will test the new BIOS rev. tomorrow.
> 
> Can you please test this diff and report if it fixes the problem for
> you?
> 
> Thanks.
> 
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index b64502d..e89daf1 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -266,7 +266,7 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
>  
>  static int iommu_init_device(struct device *dev)
>  {
> - struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev);
> + struct pci_dev *dma_pdev = NULL, *pdev = to_pci_dev(dev);
>   struct iommu_dev_data *dev_data;
>   struct iommu_group *group;
>   u16 alias;
> @@ -293,7 +293,9 @@ static int iommu_init_device(struct device *dev)
>   dev_data->alias_data = alias_data;
>  
>   dma_pdev = pci_get_bus_and_slot(alias >> 8, alias & 0xff);
> - } else
> + }
> +
> + if (dma_pdev == NULL)
>   dma_pdev = pci_dev_get(pdev);
>  
>   /* Account for quirked devices */
> 

Assuming this works, it may be ok as a 3.7 fix, but if there was
actually more than one device behind the alias we'd expose them as
separate iommu groups.  I don't think that's what we want.  Maybe it
should at least get a pr_warn.  Thanks,

Alex



Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Roedel, Joerg
Florian,

On Wed, Sep 26, 2012 at 01:01:54AM +0200, Florian Dazinger wrote:
> you're right, either "amd_iommu=off" or removing the audio card makes
> the failure disappear. I will test the new BIOS rev. tomorrow.

Can you please test this diff and report if it fixes the problem for
you?

Thanks.

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b64502d..e89daf1 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -266,7 +266,7 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 
 static int iommu_init_device(struct device *dev)
 {
-	struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev);
+	struct pci_dev *dma_pdev = NULL, *pdev = to_pci_dev(dev);
 	struct iommu_dev_data *dev_data;
 	struct iommu_group *group;
 	u16 alias;
@@ -293,7 +293,9 @@ static int iommu_init_device(struct device *dev)
 		dev_data->alias_data = alias_data;
 
 		dma_pdev = pci_get_bus_and_slot(alias >> 8, alias & 0xff);
-	} else
+	}
+
+	if (dma_pdev == NULL)
 		dma_pdev = pci_dev_get(pdev);
 
 	/* Account for quirked devices */




Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Alex Williamson
On Wed, 2012-09-26 at 15:20 +0200, Roedel, Joerg wrote:
> On Tue, Sep 25, 2012 at 01:43:46PM -0600, Alex Williamson wrote:
> > Joerg, any thoughts on a quirk for this?  Unfortunately we can't just
> > skip IOMMU groups when an alias is broken because it puts the other
> > IOMMU groups at risk that might not actually be isolated from this
> > device.  It looks like we parse the alias info before PCI is probed, so
> > maybe we'd need to call the quirk from iommu_init_device itself.
> 
> I fear that the BIOS does everything right and device 08:04.0 is indeed
> using 08:00.0 as request-id. There are a couple of devices where this
> happens, usually when the vendor just took the old 32bit PCI chip, added
> a transparent PCIe-to-PCI bridge to the device and sold it as PCIe.
> 
> So the assumption that every request-id has a corresponding pci_dev
> structure does not hold. I also had made that assumption in the
> AMD IOMMU driver but had to add code which removes that assumption. We
> should look for a way to remove that assumption from the group-code too.

Hmm, that throws a kink in iommu groups.  So perhaps we need to make an
alias interface to iommu groups.  Seems like this could just be an extra
parameter to iommu_group_get and iommu_group_add_device (empty in the
typical case).  Then we have the problem of what's the type for an
alias?  For AMD-Vi, it's a u16, but we need to be more generic than
that.  Maybe iommu groups should just treat it as a void* so iommus can
use a pointer to some structure or a fixed value like a u16 bus:slot.
Thoughts?  Thanks,

Alex



Re: 3.6-rc7 boot crash + bisection

2012-09-26 Thread Roedel, Joerg
On Tue, Sep 25, 2012 at 01:43:46PM -0600, Alex Williamson wrote:
> Joerg, any thoughts on a quirk for this?  Unfortunately we can't just
> skip IOMMU groups when an alias is broken because it puts the other
> IOMMU groups at risk that might not actually be isolated from this
> device.  It looks like we parse the alias info before PCI is probed, so
> maybe we'd need to call the quirk from iommu_init_device itself.

I fear that the BIOS does everything right and device 08:04.0 is indeed
using 08:00.0 as request-id. There are a couple of devices where this
happens, usually when the vendor just took the old 32bit PCI chip, added
a transparent PCIe-to-PCI bridge to the device and sold it as PCIe.

So the assumption that every request-id has a corresponding pci_dev
structure does not hold. I also had made that assumption in the
AMD IOMMU driver but had to add code which removes that assumption. We
should look for a way to remove that assumption from the group-code too.


Joerg
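
For readers unfamiliar with the notation: a request-id such as 08:04.0 is a
16-bit value with the bus number in the upper byte and devfn (slot << 3 |
function) in the lower byte, which is why the driver can split an alias with
"alias >> 8" and "alias & 0xff". A minimal user-space sketch; the helper
names make_devid, devid_bus, devid_devfn and print_devid are hypothetical and
not kernel functions:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Pack bus, slot and function into a 16-bit request-id (hypothetical helper). */
uint16_t make_devid(uint8_t bus, uint8_t slot, uint8_t func)
{
	assert(slot < 32 && func < 8);
	return (uint16_t)((bus << 8) | (slot << 3) | func);
}

/* Upper byte: bus number. */
uint8_t devid_bus(uint16_t devid)
{
	return (uint8_t)(devid >> 8);
}

/* Lower byte: devfn, i.e. slot in bits 7..3 and function in bits 2..0. */
uint8_t devid_devfn(uint16_t devid)
{
	return (uint8_t)(devid & 0xff);
}

/* Print a request-id in the usual bus:slot.func form. */
void print_devid(uint16_t devid)
{
	uint8_t devfn = devid_devfn(devid);

	printf("%02x:%02x.%x\n", devid_bus(devid), devfn >> 3, devfn & 7);
}
```

With this encoding 08:04.0 is 0x0820 and its alias 08:00.0 is 0x0800. The
case discussed in this thread is the one where no pci_dev exists at 0x0800.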




Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Alex Williamson
On Wed, 2012-09-26 at 01:01 +0200, Florian Dazinger wrote:
> Am Tue, 25 Sep 2012 13:43:46 -0600
> schrieb Alex Williamson :
> 
> > On Tue, 2012-09-25 at 20:54 +0200, Florian Dazinger wrote:
> > > Am Tue, 25 Sep 2012 12:32:50 -0600
> > > schrieb Alex Williamson :
> > > 
> > > > On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> > > > > Hi,
> > > > > I think I've found a regression, which causes an early boot crash, I
> > > > > appended the kernel output via jpg file, since I do not have a serial
> > > > > console or sth.
> > > > > 
> > > > > after bisection, it boils down to this commit:
> > > > > 
> > > > > 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> > > > > commit 9dcd61303af862c279df86aa97fde7ce371be774
> > > > > Author: Alex Williamson 
> > > > > Date:   Wed May 30 14:19:07 2012 -0600
> > > > > 
> > > > > amd_iommu: Support IOMMU groups
> > > > > 
> > > > > Add IOMMU group support to AMD-Vi device init and uninit code.
> > > > > Existing notifiers make sure this gets called for each device.
> > > > > 
> > > > > Signed-off-by: Alex Williamson 
> > > > > Signed-off-by: Joerg Roedel 
> > > > > 
> > > > > :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060
> > > > > 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers
> > > > 
> > > > [switching back to mailing list thread]
> > > > 
> > > > I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
> > > > 
> > > > [1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 
> > > > 1300
> > > > [1.485683] AMD-Vi:mmio-addr: feb2
> > > > [1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 
> > > > 00
> > > > [1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
> > > > [1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 
> > > > flags: 00
> > > > [1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 
> > > > 00
> > > > [1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
> > > > [1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 
> > > > flags: 00
> > > > [1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 
> > > > flags: 00
> > > > [1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 
> > > > flags: 00
> > > > [1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 
> > > > flags: 00
> > > > [1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 
> > > > flags: 00
> > > > [1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 
> > > > flags: 00
> > > > [1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 
> > > > flags: 00
> > > > [1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 
> > > > flags: 00
> > > > [1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 
> > > > flags: 00
> > > > [1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 
> > > > flags: 00
> > > > [1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 
> > > > flags: 00
> > > > [1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 
> > > > flags: 00
> > > > [1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 
> > > > flags: 00 devid_to: 08:00.0
> > > > [1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
> > > > [1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 
> > > > flags: 00
> > > > [1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 
> > > > 00
> > > > [1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
> > > > [1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 
> > > > 00
> > > > [1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
> > > > [1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 
> > > > flags: d7
> > > > [1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 
> > > > flags: 00
> > > > [1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 
> > > > flags: 00
> > > > [1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 
> > > > flags: 00 devid_to: 00:14.4
> > > > [1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
> > > > [1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 
> > > > flags: 00
> > > > [1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 
> > > > 00
> > > > [1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
> > > > [1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
> > > > 
> > > > 
> > > > > lspci:
> > > > > 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI 
> > > > > to PCI bridge (external gfx0 port B) (rev 02)
> > > > > 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory 
> > > > > Management Unit (IOMMU)
> > > > > 00:02.0 PCI bridge: Advanced Micro Devic

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Florian Dazinger
Am Tue, 25 Sep 2012 13:43:46 -0600
schrieb Alex Williamson :

> On Tue, 2012-09-25 at 20:54 +0200, Florian Dazinger wrote:
> > Am Tue, 25 Sep 2012 12:32:50 -0600
> > schrieb Alex Williamson :
> > 
> > > On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> > > > Hi,
> > > > I think I've found a regression, which causes an early boot crash, I
> > > > appended the kernel output via jpg file, since I do not have a serial
> > > > console or sth.
> > > > 
> > > > after bisection, it boils down to this commit:
> > > > 
> > > > 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> > > > commit 9dcd61303af862c279df86aa97fde7ce371be774
> > > > Author: Alex Williamson 
> > > > Date:   Wed May 30 14:19:07 2012 -0600
> > > > 
> > > > amd_iommu: Support IOMMU groups
> > > > 
> > > > Add IOMMU group support to AMD-Vi device init and uninit code.
> > > > Existing notifiers make sure this gets called for each device.
> > > > 
> > > > Signed-off-by: Alex Williamson 
> > > > Signed-off-by: Joerg Roedel 
> > > > 
> > > > :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060
> > > > 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers
> > > 
> > > [switching back to mailing list thread]
> > > 
> > > I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
> > > 
> > > [1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
> > > [1.485683] AMD-Vi:mmio-addr: feb2
> > > [1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
> > > [1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
> > > [1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 flags: 00
> > > [1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
> > > [1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
> > > [1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 flags: 00
> > > [1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 flags: 00
> > > [1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 flags: 00
> > > [1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 flags: 00
> > > [1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 flags: 00
> > > [1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 flags: 00
> > > [1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 flags: 00
> > > [1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 flags: 00
> > > [1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 flags: 00
> > > [1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 flags: 00
> > > [1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 flags: 00
> > > [1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 flags: 00
> > > [1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 flags: 00 devid_to: 08:00.0
> > > [1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
> > > [1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 flags: 00
> > > [1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
> > > [1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
> > > [1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
> > > [1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
> > > [1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 flags: d7
> > > [1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 flags: 00
> > > [1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 flags: 00
> > > [1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 flags: 00 devid_to: 00:14.4
> > > [1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
> > > [1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 flags: 00
> > > [1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
> > > [1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
> > > [1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
> > > 
> > > 
> > > > lspci:
> > > > 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to 
> > > > PCI bridge (external gfx0 port B) (rev 02)
> > > > 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory 
> > > > Management Unit (IOMMU)
> > > > 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to 
> > > > PCI bridge (PCI express gpp port B)
> > > > 00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to 
> > > > PCI bridge (PCI express gpp port D)
> > > > 00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to 
> > > > PCI bridge (PCI e

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Alex Williamson
On Tue, 2012-09-25 at 20:54 +0200, Florian Dazinger wrote:
> Am Tue, 25 Sep 2012 12:32:50 -0600
> schrieb Alex Williamson :
> 
> > On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> > > Hi,
> > > I think I've found a regression, which causes an early boot crash, I
> > > appended the kernel output via jpg file, since I do not have a serial
> > > console or sth.
> > > 
> > > after bisection, it boils down to this commit:
> > > 
> > > 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> > > commit 9dcd61303af862c279df86aa97fde7ce371be774
> > > Author: Alex Williamson 
> > > Date:   Wed May 30 14:19:07 2012 -0600
> > > 
> > > amd_iommu: Support IOMMU groups
> > > 
> > > Add IOMMU group support to AMD-Vi device init and uninit code.
> > > Existing notifiers make sure this gets called for each device.
> > > 
> > > Signed-off-by: Alex Williamson 
> > > Signed-off-by: Joerg Roedel 
> > > 
> > > :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060
> > > 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers
> > 
> > [switching back to mailing list thread]
> > 
> > I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
> > 
> > [1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
> > [1.485683] AMD-Vi:mmio-addr: feb2
> > [1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
> > [1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
> > [1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 flags: 00
> > [1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
> > [1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
> > [1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 flags: 00
> > [1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 flags: 00
> > [1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 flags: 00
> > [1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 flags: 00
> > [1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 flags: 00
> > [1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 flags: 00
> > [1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 flags: 00
> > [1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 flags: 00
> > [1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 flags: 00
> > [1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 flags: 00
> > [1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 flags: 00
> > [1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 flags: 00
> > [1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 flags: 00 devid_to: 08:00.0
> > [1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
> > [1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 flags: 00
> > [1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
> > [1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
> > [1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
> > [1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
> > [1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 flags: d7
> > [1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 flags: 00
> > [1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 flags: 00
> > [1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 flags: 00 devid_to: 00:14.4
> > [1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
> > [1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 flags: 00
> > [1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
> > [1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
> > [1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
> > 
> > 
> > > lspci:
> > > 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to 
> > > PCI bridge (external gfx0 port B) (rev 02)
> > > 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory 
> > > Management Unit (IOMMU)
> > > 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > > bridge (PCI express gpp port B)
> > > 00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > > bridge (PCI express gpp port D)
> > > 00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > > bridge (PCI express gpp port E)
> > > 00:06.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > > bridge (PCI express gpp port F)
> > > 00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > > bridge (PCI express gpp port G)
> > > 00:09.

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Alex Williamson
On Tue, 2012-09-25 at 12:32 -0600, Alex Williamson wrote:
> On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> > Hi,
> > I think I've found a regression, which causes an early boot crash, I
> > appended the kernel output via jpg file, since I do not have a serial
> > console or sth.
> > 
> > after bisection, it boils down to this commit:
> > 
> > 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> > commit 9dcd61303af862c279df86aa97fde7ce371be774
> > Author: Alex Williamson 
> > Date:   Wed May 30 14:19:07 2012 -0600
> > 
> > amd_iommu: Support IOMMU groups
> > 
> > Add IOMMU group support to AMD-Vi device init and uninit code.
> > Existing notifiers make sure this gets called for each device.
> > 
> > Signed-off-by: Alex Williamson 
> > Signed-off-by: Joerg Roedel 
> > 
> > :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060
> > 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers
> 
> [switching back to mailing list thread]
> 
> I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:
> 
> [1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
> [1.485683] AMD-Vi:mmio-addr: feb2
> [1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
> [1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
> [1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 flags: 00
> [1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
> [1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
> [1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 flags: 00
> [1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 flags: 00
> [1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 flags: 00
> [1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 flags: 00
> [1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 flags: 00
> [1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 flags: 00
> [1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 flags: 00
> [1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 flags: 00
> [1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 flags: 00
> [1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 flags: 00
> [1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 flags: 00
> [1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 flags: 00
> [1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 flags: 00 devid_to: 08:00.0
> [1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
> [1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 flags: 00
> [1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
> [1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
> [1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
> [1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
> [1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 flags: d7
> [1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 flags: 00
> [1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 flags: 00
> [1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 flags: 00 devid_to: 00:14.4
> [1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
> [1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 flags: 00
> [1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
> [1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
> [1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
> 
> 
> > lspci:
> > 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (external gfx0 port B) (rev 02)
> > 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory 
> > Management Unit (IOMMU)
> > 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port B)
> > 00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port D)
> > 00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port E)
> > 00:06.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port F)
> > 00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port G)
> > 00:09.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (PCI express gpp port H)
> > 00:0b.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> > bridge (NB-SB link)
> > 00:11.0 SATA controller: Advanced Micro Devices [AMD] nee A

Re: 3.6-rc7 boot crash + bisection

2012-09-25 Thread Alex Williamson
On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> Hi,
> I think I've found a regression, which causes an early boot crash, I
> appended the kernel output via jpg file, since I do not have a serial
> console or sth.
> 
> after bisection, it boils down to this commit:
> 
> 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> commit 9dcd61303af862c279df86aa97fde7ce371be774
> Author: Alex Williamson 
> Date:   Wed May 30 14:19:07 2012 -0600
> 
> amd_iommu: Support IOMMU groups
> 
> Add IOMMU group support to AMD-Vi device init and uninit code.
> Existing notifiers make sure this gets called for each device.
> 
> Signed-off-by: Alex Williamson 
> Signed-off-by: Joerg Roedel 
> 
> :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060
> 837ae95e84f6d3553457c4df595a9caa56843c03 M  drivers

[switching back to mailing list thread]

I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:

[1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
[1.485683] AMD-Vi:mmio-addr: feb2
[1.485901] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:00.0 flags: 00
[1.485935] AMD-Vi:   DEV_RANGE_END   devid: 00:00.2
[1.485969] AMD-Vi:   DEV_SELECT  devid: 00:02.0 flags: 00
[1.486002] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 01:00.0 flags: 00
[1.486036] AMD-Vi:   DEV_RANGE_END   devid: 01:00.1
[1.486070] AMD-Vi:   DEV_SELECT  devid: 00:04.0 flags: 00
[1.486103] AMD-Vi:   DEV_SELECT  devid: 02:00.0 flags: 00
[1.486137] AMD-Vi:   DEV_SELECT  devid: 00:05.0 flags: 00
[1.486170] AMD-Vi:   DEV_SELECT  devid: 03:00.0 flags: 00
[1.486204] AMD-Vi:   DEV_SELECT  devid: 00:06.0 flags: 00
[1.486238] AMD-Vi:   DEV_SELECT  devid: 04:00.0 flags: 00
[1.486271] AMD-Vi:   DEV_SELECT  devid: 00:07.0 flags: 00
[1.486305] AMD-Vi:   DEV_SELECT  devid: 05:00.0 flags: 00
[1.486338] AMD-Vi:   DEV_SELECT  devid: 00:09.0 flags: 00
[1.486372] AMD-Vi:   DEV_SELECT  devid: 06:00.0 flags: 00
[1.486406] AMD-Vi:   DEV_SELECT  devid: 00:0b.0 flags: 00
[1.486439] AMD-Vi:   DEV_SELECT  devid: 07:00.0 flags: 00
[1.486473] AMD-Vi:   DEV_ALIAS_RANGE devid: 08:01.0 flags: 00 devid_to: 08:00.0
[1.486510] AMD-Vi:   DEV_RANGE_END   devid: 08:1f.7
[1.486548] AMD-Vi:   DEV_SELECT  devid: 00:11.0 flags: 00
[1.486581] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:12.0 flags: 00
[1.486620] AMD-Vi:   DEV_RANGE_END   devid: 00:12.2
[1.486654] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:13.0 flags: 00
[1.486688] AMD-Vi:   DEV_RANGE_END   devid: 00:13.2
[1.486721] AMD-Vi:   DEV_SELECT  devid: 00:14.0 flags: d7
[1.486755] AMD-Vi:   DEV_SELECT  devid: 00:14.3 flags: 00
[1.486788] AMD-Vi:   DEV_SELECT  devid: 00:14.4 flags: 00
[1.486822] AMD-Vi:   DEV_ALIAS_RANGE devid: 09:00.0 flags: 00 devid_to: 00:14.4
[1.486859] AMD-Vi:   DEV_RANGE_END   devid: 09:1f.7
[1.486897] AMD-Vi:   DEV_SELECT  devid: 00:14.5 flags: 00
[1.486931] AMD-Vi:   DEV_SELECT_RANGE_START  devid: 00:16.0 flags: 00
[1.486965] AMD-Vi:   DEV_RANGE_END   devid: 00:16.2
[1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40
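
One way to read the DEV_ALIAS_RANGE / DEV_RANGE_END pairs in the dump is
as "every id from devid through the range end issues requests as
devid_to". The sketch below models that reading in userspace C. It
relies on the standard PCI encoding of a 16-bit id as bus (8 bits),
device (5 bits), function (3 bits); make_devid(), format_devid() and
set_alias_range() are hypothetical helpers written for this example, not
functions from the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Encode bus:dev.fn as a 16-bit id: bus in bits 15-8, devfn in bits 7-0. */
static uint16_t make_devid(unsigned bus, unsigned dev, unsigned fn)
{
	assert(bus < 256 && dev < 32 && fn < 8);
	return (uint16_t)((bus << 8) | (dev << 3) | fn);
}

/* Render an id in the "bb:dd.f" form used by the dump (7 chars + NUL). */
static void format_devid(uint16_t devid, char out[8])
{
	snprintf(out, 8, "%02x:%02x.%x",
		 (unsigned)(devid >> 8),
		 (unsigned)((devid >> 3) & 0x1f),
		 (unsigned)(devid & 0x7));
}

/*
 * Point every id in [first, last] at alias_to.  alias_table must have
 * 0x10000 entries.  Returns the number of entries written.
 */
static unsigned set_alias_range(uint16_t *alias_table, uint16_t first,
				uint16_t last, uint16_t alias_to)
{
	unsigned n = 0;
	uint32_t id;	/* wider than 16 bits so id <= last terminates at 0xffff */

	for (id = first; id <= last; id++, n++)
		alias_table[id] = alias_to;
	return n;
}
```

Applied to the first alias entry, the range 08:01.0 through 08:1f.7 is
ids 0x0808 through 0x08ff, so 248 ids on bus 08 all map to 08:00.0
(0x0800), including 08:04.0 (0x0820).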


> lspci:
> 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> bridge (external gfx0 port B) (rev 02)
> 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory 
> Management Unit (IOMMU)
> 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> bridge (PCI express gpp port B)
> 00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> bridge (PCI express gpp port D)
> 00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> bridge (PCI express gpp port E)
> 00:06.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> bridge (PCI express gpp port F)
> 00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> bridge (PCI express gpp port G)
> 00:09.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> bridge (PCI express gpp port H)
> 00:0b.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI 
> bridge (NB-SB link)
> 00:11.0 SATA controller: Advanced Micro Devices [AMD] nee ATI 
> SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
> 00:12.0 USB controller: Advanced Micro Devices [AMD] nee ATI 
> SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> 00:12.2 USB controller: Advanced Micro Devices [AMD] nee ATI 
> SB7x0/SB8x0/SB9x0 USB EHCI Controlle