Re: 3.6-rc7 boot crash + bisection
On Wed, Sep 26, 2012 at 04:04:03PM -0600, Alex Williamson wrote:
> Here's a lockdep clean version of it:
>
> amd_iommu: Handle aliases not backed by devices
>
> Aliases sometimes don't have a struct pci_dev backing them. This breaks
> our attempt to figure out the topology and device quirks that may affect
> IOMMU grouping. When this happens, allocate an IOMMU group on the
> dev_data for the alias and make use of it for all devices referencing
> this alias.

Yes, this is the real fix. But it is too big for v3.6 at this point, so
I will take it for 3.7 and use my small fix for 3.6.

> Signed-off-by: Alex Williamson
> ---
>
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index b64502d..4eacb17 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -71,6 +71,7 @@ static DEFINE_SPINLOCK(iommu_pd_list_lock);
>  /* List of all available dev_data structures */
>  static LIST_HEAD(dev_data_list);
>  static DEFINE_SPINLOCK(dev_data_list_lock);
> +static DEFINE_MUTEX(dev_data_iommu_group_lock);

I think this lock is not necessary. The iommu_init_device routine does
not run multiple times in parallel for the same device, so we should be
safe on that side.

	Joerg

--
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
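The idea in the quoted commit message (allocate a group on the alias's dev_data the first time it is needed, then reuse it for every device that references the alias) can be modelled in user space roughly as follows. This is only a sketch: `struct group`, `struct dev_data`, `group_alloc` and `alias_add_device` are hypothetical stand-ins, not the kernel's `struct iommu_group` or `struct iommu_dev_data`, and the model is single-threaded, so it says nothing about whether the mutex discussed above is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins for struct iommu_group and
 * struct iommu_dev_data; not the kernel definitions. */
struct group {
	int id;
	int ndevs;
};

struct dev_data {
	unsigned short devid;
	struct group *group;	/* NULL until the first aliasing device shows up */
};

static int next_group_id;

static struct group *group_alloc(void)
{
	struct group *g = calloc(1, sizeof(*g));

	if (g)
		g->id = next_group_id++;
	return g;
}

/*
 * Add one device to the group cached on its alias's dev_data,
 * allocating the group on first use. Returns the group id, or -1
 * if allocation fails. Single-threaded: no locking is modelled.
 */
static int alias_add_device(struct dev_data *alias_data)
{
	assert(alias_data != NULL);

	if (!alias_data->group) {
		alias_data->group = group_alloc();
		if (!alias_data->group)
			return -1;
	}

	alias_data->group->ndevs++;
	return alias_data->group->id;
}
```

Calling `alias_add_device` twice with the same alias data yields the same group id both times, which is the grouping property the patch is after: devices behind a device-less alias end up together instead of in separate groups.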
Re: 3.6-rc7 boot crash + bisection
On Wed, 26 Sep 2012 16:04:03 -0600, Alex Williamson wrote:
> On Wed, 2012-09-26 at 13:50 -0600, Alex Williamson wrote:
> > On Wed, 2012-09-26 at 10:21 -0600, Alex Williamson wrote:
> > > On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote:
> > > > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> > > > > Hmm, that throws a kink in iommu groups. So perhaps we need to make
> > > > > an alias interface to iommu groups. Seems like this could just be an
> > > > > extra parameter to iommu_group_get and iommu_group_add_device (empty
> > > > > in the typical case). Then we have the problem of what's the type
> > > > > for an alias? For AMD-Vi, it's a u16, but we need to be more generic
> > > > > than that. Maybe iommu groups should just treat it as a void* so
> > > > > iommus can use a pointer to some structure or a fixed value like a
> > > > > u16 bus:slot. Thoughts?
> > > >
> > > > Good question. The iommu-groups are part of the IOMMU-API, with an
> > > > interface to the IOMMU drivers and one to the users of IOMMU-API. So the
> > > > alias handling itself should be a function of the interface to the IOMMU
> > > > driver. In general the interface should not be bus specific.
> > > >
> > > > So a void pointer seems the only logical choice then. But I would not
> > > > limit its scope to alias handling. How about making it a bus-private
> > > > pointer where IOMMU drivers store bus-specific information? That way we
> > > > make sure that there is one struct per bus-type for this pointer, and
> > > > not one structure per IOMMU driver.
> > >
> > > I thought of another approach that may actually be more 3.6 worthy.
> > > What if we just make the iommu driver handle it? For instance,
> > > amd_iommu can walk the alias table looking for entries that use the same
> > > alias and get the device via pci_get_bus_and_slot. If it finds a device
> > > with an iommu group, it attaches the new device to the same group,
> > > hiding anything about aliases from the group layer. It just groups all
> > > devices within the range. I think the only complication is making sure
> > > we're safe around device hotplug while we're doing this. Thanks,
> >
> > I think this could work. Instead of searching for other devices, check
> > for or allocate an iommu group on the alias dev_data, any "virtual"
> > aliases use that iommu group. Florian, could you test this as well?
>
> Here's a lockdep clean version of it:
>
> amd_iommu: Handle aliases not backed by devices
>
[ skipped patch ]

Yes, this patch works for me, too. I also tested your second patch, and
it worked as well.

Thanks,
Florian
Re: 3.6-rc7 boot crash + bisection
On Wed, 2012-09-26 at 13:50 -0600, Alex Williamson wrote: > On Wed, 2012-09-26 at 10:21 -0600, Alex Williamson wrote: > > On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote: > > > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote: > > > > Hmm, that throws a kink in iommu groups. So perhaps we need to make an > > > > alias interface to iommu groups. Seems like this could just be an extra > > > > parameter to iommu_group_get and iommu_group_add_device (empty in the > > > > typical case). Then we have the problem of what's the type for an > > > > alias? For AMI-Vi, it's a u16, but we need to be more generic than > > > > that. Maybe iommu groups should just treat it as a void* so iommus can > > > > use a pointer to some structure or a fixed value like a u16 bus:slot. > > > > Thoughts? > > > > > > Good question. The iommu-groups are part of the IOMMU-API, with an > > > interface to the IOMMU drivers and one to the users of IOMMU-API. So the > > > alias handling itself should be a function of the interface to the IOMMU > > > driver. In general the interface should not be bus specific. > > > > > > So a void pointer seems the only logical choice then. But I would not > > > limit its scope to alias handling. How about making it a bus-private > > > pointer where IOMMU driver store bus-specific information. That way we > > > make sure that there is one struct per bus-type for this pointer, and > > > not one structure per IOMMU driver. > > > > I thought of another approach that may actually be more 3.6 worthy. > > What if we just make the iommu driver handle it? For instance, > > amd_iommu can walk the alias table looking for entries that use the same > > alias and get the device via pci_get_bus_and_slot. If it finds a device > > with an iommu group, it attaches the new device to the same group, > > hiding anything about aliases from the group layer. It just groups all > > devices within the range. 
I think the only complication is making sure > > we're safe around device hotplug while we're doing this. Thanks, > > I think this could work. Instead of searching for other devices, check > for or allocate an iommu group on the alias dev_data, any "virtual" > aliases use that iommu group. Florian, could you test this as well? Here's a lockdep clean version of it: amd_iommu: Handle aliases not backed by devices Aliases sometimes don't have a struct pci_dev backing them. This breaks our attempt to figure out the topology and device quirks that may effect IOMMU grouping. When this happens, allocate an IOMMU group on the dev_data for the alias and make use of it for all devices referencing this alias. Signed-off-by: Alex Williamson --- diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index b64502d..4eacb17 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -71,6 +71,7 @@ static DEFINE_SPINLOCK(iommu_pd_list_lock); /* List of all available dev_data structures */ static LIST_HEAD(dev_data_list); static DEFINE_SPINLOCK(dev_data_list_lock); +static DEFINE_MUTEX(dev_data_iommu_group_lock); /* * Domain for untranslated devices - only allocated @@ -128,6 +129,9 @@ static void free_dev_data(struct iommu_dev_data *dev_data) list_del(&dev_data->dev_data_list); spin_unlock_irqrestore(&dev_data_list_lock, flags); + if (dev_data->group) + iommu_group_put(dev_data->group); + kfree(dev_data); } @@ -256,6 +260,34 @@ static bool check_device(struct device *dev) return true; } +/* + * Sometimes there's no actual device for an alias. When that happens + * we allocate an iommu group on the dev_data and use it for anything + * aliasing back to this device. This makes sure that multiple devices + * aliased to a non-existent device id all get grouped together. Hold + * on to the reference for the group, it can be static rather than get + * automatically reclaimed if this device later gets removed. 
+ */ +static int dev_data_add_iommu_group(struct iommu_dev_data *dev_data, + struct device *dev) +{ + mutex_lock(&dev_data_iommu_group_lock); + + if (!dev_data->group) { + struct iommu_group *group = iommu_group_alloc(); + if (IS_ERR(group)) { + mutex_unlock(&dev_data_iommu_group_lock); + return PTR_ERR(group); + } + + dev_data->group = group; + } + + mutex_unlock(&dev_data_iommu_group_lock); + + return iommu_group_add_device(dev_data->group, dev); +} + static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to) { pci_dev_put(*from); @@ -264,38 +296,17 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to) #define REQ_ACS_FLAGS (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF) -static int iommu_init_device(struct device *dev) +/* + * Given a pci device, look at device quirks and topology between it + *
Re: 3.6-rc7 boot crash + bisection
On Wed, 2012-09-26 at 10:21 -0600, Alex Williamson wrote: > On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote: > > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote: > > > Hmm, that throws a kink in iommu groups. So perhaps we need to make an > > > alias interface to iommu groups. Seems like this could just be an extra > > > parameter to iommu_group_get and iommu_group_add_device (empty in the > > > typical case). Then we have the problem of what's the type for an > > > alias? For AMI-Vi, it's a u16, but we need to be more generic than > > > that. Maybe iommu groups should just treat it as a void* so iommus can > > > use a pointer to some structure or a fixed value like a u16 bus:slot. > > > Thoughts? > > > > Good question. The iommu-groups are part of the IOMMU-API, with an > > interface to the IOMMU drivers and one to the users of IOMMU-API. So the > > alias handling itself should be a function of the interface to the IOMMU > > driver. In general the interface should not be bus specific. > > > > So a void pointer seems the only logical choice then. But I would not > > limit its scope to alias handling. How about making it a bus-private > > pointer where IOMMU driver store bus-specific information. That way we > > make sure that there is one struct per bus-type for this pointer, and > > not one structure per IOMMU driver. > > I thought of another approach that may actually be more 3.6 worthy. > What if we just make the iommu driver handle it? For instance, > amd_iommu can walk the alias table looking for entries that use the same > alias and get the device via pci_get_bus_and_slot. If it finds a device > with an iommu group, it attaches the new device to the same group, > hiding anything about aliases from the group layer. It just groups all > devices within the range. I think the only complication is making sure > we're safe around device hotplug while we're doing this. Thanks, I think this could work. 
Instead of searching for other devices, check for or allocate an iommu group on the alias dev_data, any "virtual" aliases use that iommu group. Florian, could you test this as well? Thanks, Alex Signed-off-by: Alex Williamson --- diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index b64502d..22879ed 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -126,6 +126,8 @@ static void free_dev_data(struct iommu_dev_data *dev_data) spin_lock_irqsave(&dev_data_list_lock, flags); list_del(&dev_data->dev_data_list); + if (dev_data->group) + iommu_group_put(dev_data->group); spin_unlock_irqrestore(&dev_data_list_lock, flags); kfree(dev_data); @@ -256,6 +258,37 @@ static bool check_device(struct device *dev) return true; } +/* + * Sometimes there's no actual device for an alias. When that happens + * we allocate an iommu group on the iommu_dev_data so that it gets used + * by anything with the same alias. We keep the reference from + * iommu_group_alloc so the group persists with the iommu_dev_data. 
+ */ +static int dev_data_add_iommu_group(struct iommu_dev_data *dev_data, + struct device *dev) +{ + unsigned long flags; + struct iommu_group *group; + int ret = 0; + + spin_lock_irqsave(&dev_data_list_lock, flags); + if (!dev_data->group) { + group = iommu_group_alloc(); + if (IS_ERR(group)) { + ret = PTR_ERR(group); + goto unlock; + } + + dev_data->group = group; + } else + group = dev_data->group; + + ret = iommu_group_add_device(group, dev); +unlock: + spin_unlock_irqrestore(&dev_data_list_lock, flags); + return ret; +} + static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to) { pci_dev_put(*from); @@ -264,38 +297,12 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to) #define REQ_ACS_FLAGS (PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF) -static int iommu_init_device(struct device *dev) +static int pdev_add_iommu_group(struct pci_dev *pdev, struct device *dev) { - struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev); - struct iommu_dev_data *dev_data; + struct pci_dev *dma_pdev = pdev; struct iommu_group *group; - u16 alias; int ret; - if (dev->archdata.iommu) - return 0; - - dev_data = find_dev_data(get_device_id(dev)); - if (!dev_data) - return -ENOMEM; - - alias = amd_iommu_alias_table[dev_data->devid]; - if (alias != dev_data->devid) { - struct iommu_dev_data *alias_data; - - alias_data = find_dev_data(alias); - if (alias_data == NULL) { - pr_err("AMD-Vi: Warning: Unhandled device %s\n", - dev_name(dev)); -
Re: 3.6-rc7 boot crash + bisection
On Wed, 26 Sep 2012 17:04:07 +0200, "Roedel, Joerg" wrote:
> On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> > Assuming this works, it may be ok as a 3.7 fix, but if there was
> > actually more than one device behind the alias we'd expose them as
> > separate iommu groups. I don't think that's what we want. Maybe it
> > should at least get a pr_warn. Thanks,
>
> True, we need something more generic as the real fix. When Florian
> reports success I'll try to get this still into 3.6, otherwise to
> -stable.
>
>
> Joerg
>

... updating to the newest BIOS revision does not make any difference:
rc7 crashes, rc7+patch does not, so I definitely need this patch.
If there is more you want me to test, please let me know.

Florian
Re: 3.6-rc7 boot crash + bisection
Am Wed, 26 Sep 2012 17:04:07 +0200 schrieb "Roedel, Joerg" : > On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote: > > Assuming this works, it may be ok as a 3.7 fix, but if there was > > actually more than one device behind the alias we'd expose them as > > separate iommu groups. I don't think that's what we want. Maybe it > > should at least get a pr_warn. Thanks, > > True, we need something more generic as the real fix. When Florian > reports success I'll try to get this still into 3.6, otherwise to > -stable. > > > Joerg > it still fails with the card in a *different* slot. But with the patch applied, everything works, so this fixes it for me! Thanks a lot. Output of the relevant parts of dmesg and lspci see below. I'll still try the newest BIOS rev. and report back. thx, Florian dmesg (kernel-3.5.4): [0.448252] RPC: Registered tcp NFSv4.1 backchannel transport module. [1.471021] pci :01:00.0: Boot video device [1.471118] PCI: CLS 64 bytes, default 64 [1.473864] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300 [1.473902] AMD-Vi:mmio-addr: feb2 [1.474119] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:00.0 flags: 00 [1.474153] AMD-Vi: DEV_RANGE_END devid: 00:00.2 [1.474187] AMD-Vi: DEV_SELECT devid: 00:02.0 flags: 00 [1.474220] AMD-Vi: DEV_SELECT_RANGE_START devid: 01:00.0 flags: 00 [1.474254] AMD-Vi: DEV_RANGE_END devid: 01:00.1 [1.474287] AMD-Vi: DEV_SELECT devid: 00:04.0 flags: 00 [1.474321] AMD-Vi: DEV_SELECT devid: 02:00.0 flags: 00 [1.474354] AMD-Vi: DEV_SELECT devid: 00:05.0 flags: 00 [1.474388] AMD-Vi: DEV_SELECT devid: 03:00.0 flags: 00 [1.474421] AMD-Vi: DEV_SELECT devid: 00:06.0 flags: 00 [1.474455] AMD-Vi: DEV_SELECT devid: 04:00.0 flags: 00 [1.474488] AMD-Vi: DEV_SELECT devid: 00:07.0 flags: 00 [1.474522] AMD-Vi: DEV_SELECT devid: 05:00.0 flags: 00 [1.474555] AMD-Vi: DEV_SELECT devid: 00:09.0 flags: 00 [1.474589] AMD-Vi: DEV_SELECT devid: 06:00.0 flags: 00 [1.474622] AMD-Vi: DEV_SELECT devid: 00:0d.0 flags: 00 [1.474656] AMD-Vi: 
DEV_SELECT devid: 07:00.0 flags: 00 [1.474689] AMD-Vi: DEV_ALIAS_RANGE devid: 08:01.0 flags: 00 devid_to: 08:00.0 [1.474726] AMD-Vi: DEV_RANGE_END devid: 08:1f.7 [1.474764] AMD-Vi: DEV_SELECT devid: 00:11.0 flags: 00 [1.474798] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:12.0 flags: 00 [1.474836] AMD-Vi: DEV_RANGE_END devid: 00:12.2 [1.474870] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:13.0 flags: 00 [1.474903] AMD-Vi: DEV_RANGE_END devid: 00:13.2 [1.474937] AMD-Vi: DEV_SELECT devid: 00:14.0 flags: d7 [1.474970] AMD-Vi: DEV_SELECT devid: 00:14.3 flags: 00 [1.475004] AMD-Vi: DEV_SELECT devid: 00:14.4 flags: 00 [1.475038] AMD-Vi: DEV_ALIAS_RANGE devid: 09:00.0 flags: 00 devid_to: 00:14.4 [1.475074] AMD-Vi: DEV_RANGE_END devid: 09:1f.7 [1.475112] AMD-Vi: DEV_SELECT devid: 00:14.5 flags: 00 [1.475146] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:16.0 flags: 00 [1.475180] AMD-Vi: DEV_RANGE_END devid: 00:16.2 [1.475271] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40 [1.529007] pci :00:00.2: irq 72 for MSI/MSI-X [1.539126] AMD-Vi: Lazy IO/TLB flushing enabled [1.539750] PCI-DMA: Using software bounce buffering for IO (SWIOTLB) [1.539787] software IO TLB [mem 0xc9728000-0xcd727fff] (64MB) mapped at [8800c9728000-8800cd727fff] [1.539957] kvm: Nested Virtualization enabled lspci (kernel-3.5.4): 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) (rev 02) Subsystem: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- Capabilities: [54] MSI: Enable+ Count=1/1 Maskable- 64bit+ Address: fee0f00c Data: 41a9 Capabilities: [64] HyperTransport: MSI Mapping Enable+ Fixed+ 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port B) (prog-if 00 [Normal decode]) Control: I/O+ Mem+ 
BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B-
Re: 3.6-rc7 boot crash + bisection
On Wed, 2012-09-26 at 17:10 +0200, Roedel, Joerg wrote: > On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote: > > Hmm, that throws a kink in iommu groups. So perhaps we need to make an > > alias interface to iommu groups. Seems like this could just be an extra > > parameter to iommu_group_get and iommu_group_add_device (empty in the > > typical case). Then we have the problem of what's the type for an > > alias? For AMI-Vi, it's a u16, but we need to be more generic than > > that. Maybe iommu groups should just treat it as a void* so iommus can > > use a pointer to some structure or a fixed value like a u16 bus:slot. > > Thoughts? > > Good question. The iommu-groups are part of the IOMMU-API, with an > interface to the IOMMU drivers and one to the users of IOMMU-API. So the > alias handling itself should be a function of the interface to the IOMMU > driver. In general the interface should not be bus specific. > > So a void pointer seems the only logical choice then. But I would not > limit its scope to alias handling. How about making it a bus-private > pointer where IOMMU driver store bus-specific information. That way we > make sure that there is one struct per bus-type for this pointer, and > not one structure per IOMMU driver. I thought of another approach that may actually be more 3.6 worthy. What if we just make the iommu driver handle it? For instance, amd_iommu can walk the alias table looking for entries that use the same alias and get the device via pci_get_bus_and_slot. If it finds a device with an iommu group, it attaches the new device to the same group, hiding anything about aliases from the group layer. It just groups all devices within the range. I think the only complication is making sure we're safe around device hotplug while we're doing this. 
Thanks, Alex
Re: 3.6-rc7 boot crash + bisection
On Wed, 2012-09-26 at 17:04 +0200, Roedel, Joerg wrote:
> On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> > Assuming this works, it may be ok as a 3.7 fix, but if there was
> > actually more than one device behind the alias we'd expose them as
> > separate iommu groups. I don't think that's what we want. Maybe it
> > should at least get a pr_warn. Thanks,
>
> True, we need something more generic as the real fix. When Florian
> reports success I'll try to get this still into 3.6, otherwise to
> -stable.

Yes, 3.6 is what I meant to type. Thanks,

Alex
Re: 3.6-rc7 boot crash + bisection
On Wed, Sep 26, 2012 at 08:35:59AM -0600, Alex Williamson wrote:
> Hmm, that throws a kink in iommu groups. So perhaps we need to make an
> alias interface to iommu groups. Seems like this could just be an extra
> parameter to iommu_group_get and iommu_group_add_device (empty in the
> typical case). Then we have the problem of what's the type for an
> alias? For AMD-Vi, it's a u16, but we need to be more generic than
> that. Maybe iommu groups should just treat it as a void* so iommus can
> use a pointer to some structure or a fixed value like a u16 bus:slot.
> Thoughts?

Good question. The iommu-groups are part of the IOMMU-API, with an
interface to the IOMMU drivers and one to the users of IOMMU-API. So the
alias handling itself should be a function of the interface to the IOMMU
driver. In general the interface should not be bus specific.

So a void pointer seems the only logical choice then. But I would not
limit its scope to alias handling. How about making it a bus-private
pointer where IOMMU drivers store bus-specific information? That way we
make sure that there is one struct per bus-type for this pointer, and
not one structure per IOMMU driver.

	Joerg

--
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
Re: 3.6-rc7 boot crash + bisection
On Wed, Sep 26, 2012 at 08:52:01AM -0600, Alex Williamson wrote:
> Assuming this works, it may be ok as a 3.7 fix, but if there was
> actually more than one device behind the alias we'd expose them as
> separate iommu groups. I don't think that's what we want. Maybe it
> should at least get a pr_warn. Thanks,

True, we need something more generic as the real fix. When Florian
reports success I'll try to get this still into 3.6, otherwise to
-stable.

	Joerg

--
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
Re: 3.6-rc7 boot crash + bisection
On Wed, 2012-09-26 at 16:43 +0200, Roedel, Joerg wrote:
> Florian,
>
> On Wed, Sep 26, 2012 at 01:01:54AM +0200, Florian Dazinger wrote:
> > you're right, either "amd_iommu=off" or removing the audio card makes
> > the failure disappear. I will test the new BIOS rev. tomorrow.
>
> Can you please test this diff and report if it fixes the problem for
> you?
>
> Thanks.
>
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index b64502d..e89daf1 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -266,7 +266,7 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
>
>  static int iommu_init_device(struct device *dev)
>  {
> -	struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev);
> +	struct pci_dev *dma_pdev = NULL, *pdev = to_pci_dev(dev);
>  	struct iommu_dev_data *dev_data;
>  	struct iommu_group *group;
>  	u16 alias;
> @@ -293,7 +293,9 @@ static int iommu_init_device(struct device *dev)
>  		dev_data->alias_data = alias_data;
>
>  		dma_pdev = pci_get_bus_and_slot(alias >> 8, alias & 0xff);
> -	} else
> +	}
> +
> +	if (dma_pdev == NULL)
>  		dma_pdev = pci_dev_get(pdev);
>
>  	/* Account for quirked devices */
>

Assuming this works, it may be ok as a 3.7 fix, but if there was
actually more than one device behind the alias we'd expose them as
separate iommu groups. I don't think that's what we want. Maybe it
should at least get a pr_warn. Thanks,

Alex
Re: 3.6-rc7 boot crash + bisection
Florian,

On Wed, Sep 26, 2012 at 01:01:54AM +0200, Florian Dazinger wrote:
> you're right, either "amd_iommu=off" or removing the audio card makes
> the failure disappear. I will test the new BIOS rev. tomorrow.

Can you please test this diff and report if it fixes the problem for
you?

Thanks.

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index b64502d..e89daf1 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -266,7 +266,7 @@ static void swap_pci_ref(struct pci_dev **from, struct pci_dev *to)
 
 static int iommu_init_device(struct device *dev)
 {
-	struct pci_dev *dma_pdev, *pdev = to_pci_dev(dev);
+	struct pci_dev *dma_pdev = NULL, *pdev = to_pci_dev(dev);
 	struct iommu_dev_data *dev_data;
 	struct iommu_group *group;
 	u16 alias;
@@ -293,7 +293,9 @@ static int iommu_init_device(struct device *dev)
 		dev_data->alias_data = alias_data;
 
 		dma_pdev = pci_get_bus_and_slot(alias >> 8, alias & 0xff);
-	} else
+	}
+
+	if (dma_pdev == NULL)
 		dma_pdev = pci_dev_get(pdev);
 
 	/* Account for quirked devices */

--
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
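The behavioural change in the diff above is small: the alias is looked up as a bus number (`alias >> 8`) and a devfn (`alias & 0xff`), and when no device exists at that address the code now falls back to the device itself instead of carrying on with a NULL pointer. A user-space model of that fallback, assuming a made-up device table, might look like this. `struct fake_pdev`, `present`, `lookup` and `pick_dma_dev` are illustrative names only, and reference counting (`pci_dev_get`) is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev: just an address. */
struct fake_pdev {
	uint8_t bus;
	uint8_t devfn;	/* (slot << 3) | function */
};

/* Hypothetical list of devices that actually exist on the bus. */
static struct fake_pdev present[] = {
	{ 0x08, 0x20 },	/* 08:04.0 */
	{ 0x00, 0xa4 },	/* 00:14.4 */
};

/* Models pci_get_bus_and_slot(): NULL if nothing lives at bus/devfn. */
static struct fake_pdev *lookup(uint8_t bus, uint8_t devfn)
{
	size_t i;

	for (i = 0; i < sizeof(present) / sizeof(present[0]); i++)
		if (present[i].bus == bus && present[i].devfn == devfn)
			return &present[i];
	return NULL;
}

/* Pick the device to base grouping on: the alias if it exists,
 * otherwise the device itself (the fallback the diff adds). */
static struct fake_pdev *pick_dma_dev(struct fake_pdev *pdev, uint16_t alias)
{
	struct fake_pdev *dma_pdev = lookup(alias >> 8, alias & 0xff);

	if (dma_pdev == NULL)
		dma_pdev = pdev;
	return dma_pdev;
}
```

With this table, 08:04.0 aliased to 08:00.0 (`0x0800`) resolves to 08:04.0 itself, because nothing is present at 08:00.0, while a device aliased to 00:14.4 (`0x00a4`) resolves to the existing 00:14.4 entry.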
Re: 3.6-rc7 boot crash + bisection
On Wed, 2012-09-26 at 15:20 +0200, Roedel, Joerg wrote:
> On Tue, Sep 25, 2012 at 01:43:46PM -0600, Alex Williamson wrote:
> > Joerg, any thoughts on a quirk for this? Unfortunately we can't just
> > skip IOMMU groups when an alias is broken because it puts the other
> > IOMMU groups at risk that might not actually be isolated from this
> > device. It looks like we parse the alias info before PCI is probed, so
> > maybe we'd need to call the quirk from iommu_init_device itself.
>
> I fear that the BIOS does everything right and device 08:04.0 is indeed
> using 08:00.0 as request-id. There are a couple of devices where this
> happens, usually when the vendor just took the old 32bit PCI chip, added
> a transparent PCIe-to-PCI bridge to the device and sells it as PCIe.
>
> So the assumption that every request-id has a corresponding pci_dev
> structure does not hold. I had also made that assumption in the
> AMD IOMMU driver but had to add code which removes that assumption. We
> should look for a way to remove that assumption from the group-code too.

Hmm, that throws a kink in iommu groups. So perhaps we need to make an
alias interface to iommu groups. Seems like this could just be an extra
parameter to iommu_group_get and iommu_group_add_device (empty in the
typical case). Then we have the problem of what's the type for an
alias? For AMD-Vi, it's a u16, but we need to be more generic than
that. Maybe iommu groups should just treat it as a void* so iommus can
use a pointer to some structure or a fixed value like a u16 bus:slot.
Thoughts? Thanks,

Alex
Re: 3.6-rc7 boot crash + bisection
On Tue, Sep 25, 2012 at 01:43:46PM -0600, Alex Williamson wrote:
> Joerg, any thoughts on a quirk for this? Unfortunately we can't just
> skip IOMMU groups when an alias is broken because it puts the other
> IOMMU groups at risk that might not actually be isolated from this
> device. It looks like we parse the alias info before PCI is probed, so
> maybe we'd need to call the quirk from iommu_init_device itself.

I fear that the BIOS does everything right and device 08:04.0 is indeed
using 08:00.0 as request-id. There are a couple of devices where this
happens, usually when the vendor just took the old 32bit PCI chip, added
a transparent PCIe-to-PCI bridge to the device and sells it as PCIe.

So the assumption that every request-id has a corresponding pci_dev
structure does not hold. I had also made that assumption in the
AMD IOMMU driver but had to add code which removes that assumption. We
should look for a way to remove that assumption from the group-code too.

	Joerg

--
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
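To make the request-id point concrete: the dump posted in this thread contains the entry "DEV_ALIAS_RANGE devid: 08:01.0 ... devid_to: 08:00.0" ending at 08:1f.7, so every function on bus 08 from 08:01.0 upward, including 08:04.0, issues requests as 08:00.0. A rough user-space model of such a table follows. The identity-initialised array and the helper names `alias_table_init` and `alias_range` are assumptions made for illustration; only the idea of a table indexed by the 16-bit device id comes from the driver code quoted in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit device id: bus in the high byte, then 5 bits slot, 3 bits function. */
#define DEVID(bus, slot, fn) ((uint16_t)(((bus) << 8) | ((slot) << 3) | (fn)))
#define NUM_DEVIDS 0x10000

/* Hypothetical model of an alias table indexed by device id. */
static uint16_t alias_table[NUM_DEVIDS];

/* Start with every device aliasing to itself, i.e. "no alias". */
static void alias_table_init(void)
{
	uint32_t id;

	for (id = 0; id < NUM_DEVIDS; id++)
		alias_table[id] = (uint16_t)id;
}

/* Model of a DEV_ALIAS_RANGE ... DEV_RANGE_END pair: every id in
 * [first, last] uses 'to' as its request-id. */
static void alias_range(uint16_t first, uint16_t last, uint16_t to)
{
	uint32_t id;	/* wider than uint16_t so last == 0xffff terminates */

	assert(first <= last);
	for (id = first; id <= last; id++)
		alias_table[id] = to;
}
```

After `alias_range(DEVID(8, 1, 0), DEVID(8, 0x1f, 7), DEVID(8, 0, 0))`, the entry for 08:04.0 points at 08:00.0 even though, as Joerg notes, no pci_dev needs to exist at that address.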
Re: 3.6-rc7 boot crash + bisection
On Mon, 2012-09-24 at 21:03 +0200, Florian Dazinger wrote:
> Hi,
> I think I've found a regression, which causes an early boot crash, I
> appended the kernel output via jpg file, since I do not have a serial
> console or sth.
>
> after bisection, it boils down to this commit:
>
> 9dcd61303af862c279df86aa97fde7ce371be774 is the first bad commit
> commit 9dcd61303af862c279df86aa97fde7ce371be774
> Author: Alex Williamson
> Date: Wed May 30 14:19:07 2012 -0600
>
>     amd_iommu: Support IOMMU groups
>
>     Add IOMMU group support to AMD-Vi device init and uninit code.
>     Existing notifiers make sure this gets called for each device.
>
>     Signed-off-by: Alex Williamson
>     Signed-off-by: Joerg Roedel
>
> :04 04 2f6b1b8e104d6dfec0abaa9646750f9b5a4f4060 837ae95e84f6d3553457c4df595a9caa56843c03 M drivers

[switching back to mailing list thread]

I asked Florian for dmesg w/ amd_iommu_dump, here's the relevant lines:

[1.485645] AMD-Vi: device: 00:00.2 cap: 0040 seg: 0 flags: 3e info 1300
[1.485683] AMD-Vi: mmio-addr: feb2
[1.485901] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:00.0 flags: 00
[1.485935] AMD-Vi: DEV_RANGE_END devid: 00:00.2
[1.485969] AMD-Vi: DEV_SELECT devid: 00:02.0 flags: 00
[1.486002] AMD-Vi: DEV_SELECT_RANGE_START devid: 01:00.0 flags: 00
[1.486036] AMD-Vi: DEV_RANGE_END devid: 01:00.1
[1.486070] AMD-Vi: DEV_SELECT devid: 00:04.0 flags: 00
[1.486103] AMD-Vi: DEV_SELECT devid: 02:00.0 flags: 00
[1.486137] AMD-Vi: DEV_SELECT devid: 00:05.0 flags: 00
[1.486170] AMD-Vi: DEV_SELECT devid: 03:00.0 flags: 00
[1.486204] AMD-Vi: DEV_SELECT devid: 00:06.0 flags: 00
[1.486238] AMD-Vi: DEV_SELECT devid: 04:00.0 flags: 00
[1.486271] AMD-Vi: DEV_SELECT devid: 00:07.0 flags: 00
[1.486305] AMD-Vi: DEV_SELECT devid: 05:00.0 flags: 00
[1.486338] AMD-Vi: DEV_SELECT devid: 00:09.0 flags: 00
[1.486372] AMD-Vi: DEV_SELECT devid: 06:00.0 flags: 00
[1.486406] AMD-Vi: DEV_SELECT devid: 00:0b.0 flags: 00
[1.486439] AMD-Vi: DEV_SELECT devid: 07:00.0 flags: 00
[1.486473] AMD-Vi: DEV_ALIAS_RANGE devid: 08:01.0 flags: 00 devid_to: 08:00.0
[1.486510] AMD-Vi: DEV_RANGE_END devid: 08:1f.7
[1.486548] AMD-Vi: DEV_SELECT devid: 00:11.0 flags: 00
[1.486581] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:12.0 flags: 00
[1.486620] AMD-Vi: DEV_RANGE_END devid: 00:12.2
[1.486654] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:13.0 flags: 00
[1.486688] AMD-Vi: DEV_RANGE_END devid: 00:13.2
[1.486721] AMD-Vi: DEV_SELECT devid: 00:14.0 flags: d7
[1.486755] AMD-Vi: DEV_SELECT devid: 00:14.3 flags: 00
[1.486788] AMD-Vi: DEV_SELECT devid: 00:14.4 flags: 00
[1.486822] AMD-Vi: DEV_ALIAS_RANGE devid: 09:00.0 flags: 00 devid_to: 00:14.4
[1.486859] AMD-Vi: DEV_RANGE_END devid: 09:1f.7
[1.486897] AMD-Vi: DEV_SELECT devid: 00:14.5 flags: 00
[1.486931] AMD-Vi: DEV_SELECT_RANGE_START devid: 00:16.0 flags: 00
[1.486965] AMD-Vi: DEV_RANGE_END devid: 00:16.2
[1.487055] AMD-Vi: Enabling IOMMU at :00:00.2 cap 0x40

> lspci:
> 00:00.0 Host bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (external gfx0 port B) (rev 02)
> 00:00.2 IOMMU: Advanced Micro Devices [AMD] nee ATI RD990 I/O Memory Management Unit (IOMMU)
> 00:02.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port B)
> 00:04.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port D)
> 00:05.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port E)
> 00:06.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port F)
> 00:07.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port G)
> 00:09.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (PCI express gpp port H)
> 00:0b.0 PCI bridge: Advanced Micro Devices [AMD] nee ATI RD890 PCI to PCI bridge (NB-SB link)
> 00:11.0 SATA controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)
> 00:12.0 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB OHCI0 Controller
> 00:12.2 USB controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 USB EHCI Controlle