Bug#1015871: Enabling PCI_P2PDMA for distro kernels?
On Wed, Oct 25, 2023 at 07:11:26PM +0200, Lukas Wunner wrote:
> On Wed, Oct 25, 2023 at 10:30:07AM -0600, Logan Gunthorpe wrote:
> > In addition to the above, P2PDMA transfers are only allowed by the
> > kernel for traffic that flows through certain host bridges that are
> > known to work. For AMD, all modern CPUs are on this list, but for Intel,
> > the list is very patchy.
>
> This has recently been brought up internally at Intel and nobody could
> understand why there's a whitelist in the first place. A long-time PCI
> architect told me that Intel silicon validation has been testing P2PDMA
> at least since the Lindenhurst days, i.e. since 2005.
>
> What's the reason for the whitelist? Was there Intel hardware which
> didn't support it or turned out to be broken?
>
> I imagine (but am not certain) that the feature might only be enabled
> for server SKUs, is that the reason?

No, the reason is that the PCIe spec doesn't require routing of
peer-to-peer transactions between Root Ports:

  https://git.kernel.org/linus/0f97da831026

I think there was a little discussion about adding a firmware
interface to advertise this capability, but I guess nobody cared
enough to advance it.

Bjorn
Bug#679545: Upstream PCI bugzilla for this issue
https://bugzilla.kernel.org/show_bug.cgi?id=48451 -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#679545: ia64, SR870, EFI bug breaks ata_piix, uninitialized ICH4 IDE EXBAR mem resource
On Sat, Sep 29, 2012 at 4:09 PM, Stephan Schreiber i...@fs-driver.org wrote:
> Hello Bjorn,
>
> thank you very much for the patch. I tested it; it works.
> (typing mistake: it must read PCI_COMMAND_MEMORY instead of
> PCI_COMMAND_MEM at one location; some hunks of the patch couldn't be
> applied automatically on Kernel 3.2.23 because some comments in the
> contexts are different)

Thanks a lot for testing this! I'll fix up this typo and work on
getting something like this merged.

> The dmesg output:
> [0.00] Initializing cgroup subsys cpuset
> [0.00] Initializing cgroup subsys cpu
> [0.00] Linux version 3.2.0-3-mckinley (Debian 3.2.23-1) (debian-ker...@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-8) ) #1 SMP Fri Sep 28 21:57:11 CEST 2012
> ...
> [0.065510] pci :00:1f.1: [8086:24cb] type 0 class 0x000101
> [0.065524] pci :00:1f.1: reg 10: [io 0x-0x0007]
> [0.065535] pci :00:1f.1: reg 14: [io 0x-0x0003]
> [0.065546] pci :00:1f.1: reg 18: [io 0x-0x0007]
> [0.065556] pci :00:1f.1: reg 1c: [io 0x-0x0003]
> [0.065567] pci :00:1f.1: reg 20: [io 0x1000-0x100f]
> [0.065578] pci :00:1f.1: reg 24: [mem 0x-0x03ff unset]
> ...
> [1.391380] libata version 3.00 loaded.
> [1.391922] ata_piix :00:1f.1: version 2.13
> [1.391938] ata_piix :00:1f.1: can't derive routing for PCI INT A
> [1.392493] scsi0 : ata_piix
> [1.392886] scsi1 : ata_piix
> [1.393018] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0x1000 irq 34
> [1.393066] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x1008 irq 33
> [1.557756] ata1.00: ATAPI: HL-DT-ST DVDRAM GSA-T40N, JR03, max UDMA/33
> [1.573616] ata1.00: configured for UDMA/33
> [1.579147] scsi 0:0:0:0: CD-ROM HL-DT-ST DVDRAM GSA-T40N JR03 PQ: 0 ANSI: 5
> [1.590806] sr0: scsi3-mmc drive: 24x/24x writer dvd-ram cd/rw xa/form2 cdda tray
> [1.590872] cdrom: Uniform CD-ROM driver Revision: 3.20
> [1.591272] sr 0:0:0:0: Attached scsi CD-ROM sr0
> [1.593910] sr 0:0:0:0: Attached scsi generic sg0 type 5
> ...
>
> > On x86, Windows normally doesn't reconfigure PCI devices unless it
> > finds a problem with the configuration done by the BIOS. I suspect it
> > works similarly on ia64. I would guess that Windows noticed that the
> > MEM bit was not set, and therefore ignored the MEM BAR contents.
>
> Since I have the four Windows versions 'for Itanium Based Systems' on
> that box as well (XP, Server 2003, 2008, 2008 R2), I can tell you more:
> The Device Manager shows a memory range FFBFFC00-FFBF for the Intel
> 82801DB Ultra ATA Storage Controller-24CB - on any of these Windows
> versions.

Oh, that's good data, thanks! It looks like Windows noticed that the
BAR was invalid and assigned a valid resource to it. That's in the
third aperture below:

  ACPI: PCI Root Bridge [PCI0] (domain [bus 00-01])
  pci_root PNP0A03:00: host bridge window [mem 0x000a-0x000f]
  pci_root PNP0A03:00: host bridge window [mem 0xfa00-0xfbff]
  pci_root PNP0A03:00: host bridge window [mem 0xff00-0x]
  pci_root PNP0A03:00: host bridge window [mem 0xfec0-0xfec0]

Linux *should* probably do the same (though at a different actual
address because we assign bottom-up instead of top-down as Windows
does). I don't know off the top of my head whether we actually do in
this case or not.

What's the output of dmesg | grep :00:1f.1; lspci -vs00:1f.1?
Bug#679545: [RFC/PATCH v2] ia64, SR870, EFI bug breaks ata_piix, uninitialized ICH4 IDE EXBAR mem resource
On Mon, Sep 24, 2012 at 07:09:12PM +0200, Stephan Schreiber wrote: Mpfff, there aren't many replies; seems I didn't satisfy what you want to have... At first I want to mention that I just want to help the Debian project and started testing Debian Wheezy my old ia64 box. Thanks, I really appreciate that, and you've done a huge amount of debugging and testing already. It's very normal to iterate on the resolution as we're doing now. The firmware left the memory BAR at 0x24 cleared (0x), but it also left the MEM bit in the command register disabled. So it seems like a Linux bug that we're trying to use that zero address from the BAR. If the firmware left the MEM or IO decode enable bit cleared, why would we assume it put anything useful in the corresponding BARs? Your idea would be a fundamental change in the Kernel; I just want to fix the ata_piix problem in Debian Wheezy. Right. I think you've tripped over a rather fundamental issue, and I'm hoping we can fix that. If we can, that will help many users, not just the handful who have this ia64 box. If you would evaluate the command registers, which the BIOS or EFI has initialized, you would work around some wrong BARs. You might run into trouble due to wrong command register values instead. Are you sure that any BIOS or EFI sets the command registers correctly? We can't be 100% sure about things like that, of course. But we do know that if the MEM or IO bits are set in the command register, the device will claim transactions that match whatever is in the BARs. So setting the MEM or IO bit is a pretty strong statement that the BAR contains a valid address. If the BIOS leaves those bits clear, we really can't conclude anything about the BAR contents. Currently the Linux Kernel sets and clears the IORESOURCE_MEM and IORESOURCE_IO bits in the command registers as needed. Windows reconfigures any PCI device. The settings of the BIOS or EFI do not matter at all; the user doesn't experience any BIOS bug at all. 
On x86, Windows normally doesn't reconfigure PCI devices unless it finds a problem with the configuration done by the BIOS. I suspect it works similarly on ia64. I would guess that Windows noticed that the MEM bit was not set, and therefore ignored the MEM BAR contents. Can you try the following patch? It's based on 3.6-rc5+, but I think it will apply to your 3.2.23 kernel with minor conflicts that shouldn't be too hard to resolve. It's not quite right because we really shouldn't turn on the MEM or IO decode bit unless *all* of the corresponding BARs have been set, but in your case, I think there is only one MEM BAR that is an issue. Bjorn commit 9038dd3b3c4c9e4c7ca0118c8df398c4c646ab58 Author: Bjorn Helgaas bhelg...@google.com Date: Mon Sep 24 17:16:28 2012 -0600 vsprintf: Add support for IORESOURCE_UNSET in %pR diff --git a/lib/vsprintf.c b/lib/vsprintf.c index 0e33754..b6c 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -600,7 +600,7 @@ char *resource_string(char *buf, char *end, struct resource *res, * 64-bit res (sizeof==8): 20 chars in dec, 18 in hex (0x + 16) */ #define RSRC_BUF_SIZE ((2 * sizeof(resource_size_t)) + 4) #define FLAG_BUF_SIZE (2 * sizeof(res-flags)) -#define DECODED_BUF_SIZE sizeof([mem - 64bit pref window disabled]) +#define DECODED_BUF_SIZE sizeof([mem - 64bit pref window unset disabled]) #define RAW_BUF_SIZE sizeof([mem - flags 0x]) char sym[max(2*RSRC_BUF_SIZE + DECODED_BUF_SIZE, 2*RSRC_BUF_SIZE + FLAG_BUF_SIZE + RAW_BUF_SIZE)]; @@ -642,6 +642,8 @@ char *resource_string(char *buf, char *end, struct resource *res, p = string(p, pend, pref, str_spec); if (res-flags IORESOURCE_WINDOW) p = string(p, pend, window, str_spec); + if (res-flags IORESOURCE_UNSET) + p = string(p, pend, unset, str_spec); if (res-flags IORESOURCE_DISABLED) p = string(p, pend, disabled, str_spec); } else { commit f4795a79dc370b6f4106768b16a4a9edba4df933 Author: Bjorn Helgaas bhelg...@google.com Date: Mon Sep 24 17:15:30 2012 -0600 PCI: Ignore BAR contents when 
firmware left decoding disabled diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index 2396111..6926dcb 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -175,9 +175,10 @@ int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type, mask = type ? PCI_ROM_ADDRESS_MASK : ~0; + pci_read_config_word(dev, PCI_COMMAND, orig_cmd); + /* No printks while decoding is disabled! */ if (!dev-mmio_always_on) { - pci_read_config_word(dev, PCI_COMMAND, orig_cmd); pci_write_config_word(dev, PCI_COMMAND, orig_cmd ~(PCI_COMMAND_MEMORY | PCI_COMMAND_IO)); } @@ -211,9 +212,13 @@ int
Bug#679545: [RFC/PATCH v2] ia64, SR870, EFI bug breaks ata_piix, uninitialized ICH4 IDE EXBAR mem resource
On Thu, Sep 20, 2012 at 8:16 AM, Stephan Schreiber i...@fs-driver.org wrote: description of the symptoms which you have already read on the initial RFC/PATCH== Kernel 3.2.23 with Debian patches (Debian Wheezy, testing) Debian bug#679545 (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=679545) Machine: Dell PowerEdge 3250 (equivalent with Intel SR870BH2) Processor: 2x Itanium Madison 1.5GHz 6M Memory: 4GB Intel ICH4 (82801DB), IDE host adapter. The ata_piix module fails to initialize. A snippet from dmesg: [0.00] Initializing cgroup subsys cpuset [0.00] Initializing cgroup subsys cpu [0.00] Linux version 3.2.0-3-mckinley (Debian 3.2.23-1) (debian-ker...@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-8) ) #1 SMP Mon Jul 23 09:01:02 UTC 2012 ... [0.065516] pci :00:1f.1: [8086:24cb] type 0 class 0x000101 [0.065530] pci :00:1f.1: reg 10: [io 0x-0x0007] [0.065541] pci :00:1f.1: reg 14: [io 0x-0x0003] [0.065552] pci :00:1f.1: reg 18: [io 0x-0x0007] [0.065563] pci :00:1f.1: reg 1c: [io 0x-0x0003] [0.065574] pci :00:1f.1: reg 20: [io 0x1000-0x100f] [0.065585] pci :00:1f.1: reg 24: [mem 0x-0x03ff] ... [1.640965] libata version 3.00 loaded. [1.641656] ata_piix :00:1f.1: version 2.13 [1.641671] ata_piix :00:1f.1: device not available (can't reserve [mem 0x-0x03ff]) [1.641747] ata_piix: probe of :00:1f.1 failed with error -22 ... 
lspci -vvxxx reports: 00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller (rev 02) (prog-if 8a [Master SecP PriP]) Subsystem: Intel Corporation Device 3404 Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium TAbort- TAbort- MAbort- SERR- PERR- INTx- Latency: 0 Interrupt: pin A routed to IRQ 0 Region 0: I/O ports at 01f0 [size=8] Region 1: I/O ports at 03f4 [size=1] Region 2: I/O ports at 0170 [size=8] Region 3: I/O ports at 0374 [size=1] Region 4: I/O ports at 1000 [size=16] 00: 86 80 cb 24 05 00 80 02 02 8a 01 01 00 00 00 00 10: 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 20: 01 10 00 00 00 00 00 00 00 00 00 00 86 80 04 34 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 40: 03 a3 00 80 00 00 00 00 01 00 02 00 00 00 00 00 50: 00 00 00 00 00 04 00 00 00 00 00 00 00 00 00 00 60: 08 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f0: 00 00 00 00 00 00 00 00 60 0f 00 00 00 00 00 00 You can read in the Intel 82801DB I/O Controller Hub 4 (ICH4) datasheet (http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/82801db-io-controller-hub-4-datasheet.pdf) about the EXBAR register at offset 0x24 (4 bytes): EXBAR register This is a memory mapped BAR that requires 1 KB of DWord-aligned memory that is Intel reserved for future functionality. BIOS needs to program the base address for a 1-KB memory space. The dump shows that EXBAR is 0x, equal to the default value after reset; EFI doesn't initialize it. 
ata_piix uses pcim_enable_device() which enables this along with the I/O BARs. In systems based on the Intel SR870 platform the firmware does not initialize the EXBAR and pcim_enable_device() fails because the memory region 0x0-0x3FF cannot be allocated. =description of the symptoms which you have already read on the initial RFC/PATCH My only disagreement here would be putting it in the ia64 paths. If someone does the same for x86-32 (and this is EFI so it'll presumbly smell the same on all platforms) then we'll want the same. Better I think to generically catch the 0/0 case. Alan Here is a new patch. It extends some existing code in pci_setup_device() which maintains some hard-coded io regions on ide controllers in legacy mode. The idea is hiding an uninitialized EXBAR just as on the initial patch. The patch is defensive; it does nothing if - the controller isn't in legacy mode, - BAR5 (EXBAR) isn't a memory resource, or - BAR5 is already initialized. The patch is generic because it works on both x86-32 and ia64 and also for other ICH4 variants than my rare 82801DB_11 ICH4. Even the added 'if' statement of this patch is also executed on IDE controllers of other vendors than Intel or on other Intel ICHs, I believe that it won't break anything. This still isn't very generic. It only looks at BAR
Bug#679545: [RFC/PATCH] ia64, SR870, EFI bug breaks ata_piix, uninitialized ICH4 IDE EXBAR mem resource
On Sun, Sep 16, 2012 at 10:39 AM, Stephan Schreiber i...@fs-driver.org wrote: [0.065516] pci :00:1f.1: [8086:24cb] type 0 class 0x000101 [0.065530] pci :00:1f.1: reg 10: [io 0x-0x0007] [0.065541] pci :00:1f.1: reg 14: [io 0x-0x0003] [0.065552] pci :00:1f.1: reg 18: [io 0x-0x0007] [0.065563] pci :00:1f.1: reg 1c: [io 0x-0x0003] [0.065574] pci :00:1f.1: reg 20: [io 0x1000-0x100f] [0.065585] pci :00:1f.1: reg 24: [mem 0x-0x03ff] ... [1.640965] libata version 3.00 loaded. [1.641656] ata_piix :00:1f.1: version 2.13 [1.641671] ata_piix :00:1f.1: device not available (can't reserve [mem 0x-0x03ff]) [1.641747] ata_piix: probe of :00:1f.1 failed with error -22 ... lspci -vvxxx reports: 00:1f.1 IDE interface: Intel Corporation 82801DB (ICH4) IDE Controller (rev 02) (prog-if 8a [Master SecP PriP]) Subsystem: Intel Corporation Device 3404 Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium TAbort- TAbort- MAbort- SERR- PERR- INTx- Latency: 0 Interrupt: pin A routed to IRQ 0 Region 0: I/O ports at 01f0 [size=8] Region 1: I/O ports at 03f4 [size=1] Region 2: I/O ports at 0170 [size=8] Region 3: I/O ports at 0374 [size=1] Region 4: I/O ports at 1000 [size=16] 00: 86 80 cb 24 05 00 80 02 02 8a 01 01 00 00 00 00 10: 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 20: 01 10 00 00 00 00 00 00 00 00 00 00 86 80 04 34 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 I agree that we should have a generic way to do this rather than an ia64-specific way. In this case you have EFI, but the same thing could happen with BIOS. The firmware left the memory BAR at 0x24 cleared (0x), but it also left the MEM bit in the command register disabled. So it seems like a Linux bug that we're trying to use that zero address from the BAR. If the firmware left the MEM or IO decode enable bit cleared, why would we assume it put anything useful in the corresponding BARs? 
What would break if we paid attention to the command register enables in the PCI core and just cleared the resource flags for MEM BARs if the MEM-decode bit was off, and those for IO BARs if the IO-decode bit was off? I don't know much of the ancient history here, so maybe there's a good reason why this works the way it currently does.

Bjorn
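The question above can be made concrete with the config-space dump quoted in this thread. The following Python sketch is not kernel code. It assumes the standard type 0 header layout (command register at offset 0x04, six 32-bit BARs at 0x10 to 0x24), ignores 64-bit BARs, and uses a hypothetical helper name, `classify_bars`. It marks a BAR as untrusted when the matching decode-enable bit is clear.

```python
# Sketch only: models the proposed policy, not the kernel's actual code.
PCI_COMMAND = 0x04          # offset of the command register
PCI_COMMAND_IO = 0x1        # I/O decode enable
PCI_COMMAND_MEMORY = 0x2    # memory decode enable
PCI_BASE_ADDRESS_0 = 0x10   # first of six 32-bit BARs

# First 0x30 bytes of the lspci -vvxxx dump for 00:1f.1 quoted in this thread.
DUMP = """
00: 86 80 cb 24 05 00 80 02 02 8a 01 01 00 00 00 00
10: 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00
20: 01 10 00 00 00 00 00 00 00 00 00 00 86 80 04 34
"""


def parse_dump(text):
    """Turn 'offset: byte byte ...' lines into a flat bytes object."""
    data = bytearray()
    for line in text.strip().splitlines():
        _, _, rest = line.partition(":")
        data.extend(int(b, 16) for b in rest.split())
    return bytes(data)


def classify_bars(cfg):
    """Hypothetical helper: return (offset, kind, raw, trusted) per BAR."""
    command = int.from_bytes(cfg[PCI_COMMAND:PCI_COMMAND + 2], "little")
    result = []
    for i in range(6):
        off = PCI_BASE_ADDRESS_0 + 4 * i
        raw = int.from_bytes(cfg[off:off + 4], "little")
        is_io = bool(raw & 0x1)  # bit 0 set means an I/O BAR
        enable = PCI_COMMAND_IO if is_io else PCI_COMMAND_MEMORY
        # Only believe the BAR contents if firmware enabled that decode type.
        trusted = bool(command & enable)
        result.append((off, "io" if is_io else "mem", raw, trusted))
    return result


if __name__ == "__main__":
    for off, kind, raw, trusted in classify_bars(parse_dump(DUMP)):
        note = "use firmware value" if trusted else "treat as unset"
        print(f"reg {off:02x}: {kind} raw=0x{raw:08x} -> {note}")
```

On this dump the command register is 0x0005 (I/O decode and bus mastering on, memory decode off), so the five I/O BARs are kept and only the memory BAR at 0x24 is treated as unset, which matches the Control: I/O+ Mem- line in the lspci output.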
Bug#543308: [Bug 15362] MPT Fusion SCSI drives no longer appear - suspect PCI bus scan bug
> in short: the bios is broken, it return wrong segment in DSDT.

I *think* what Yinghai is saying is:

  - MMCONFIG is not used either in 2.6.26 or 2.6.32.

  - BIOS reports these host bridges via DSDT PNP0A08 devices:
      [PCI0] leading to segment 0000 bus 00
      [PCI1] leading to segment 0001 bus 40
      [PCI2] leading to segment 0002 bus 80

  - Buses 40 and 80 are actually in segment 0, not segments 1 and 2.

  - When we enumerate bus 40 and bus 80, we pass seg=1 and seg=2,
    respectively, to pci_conf1_read(), but 2.6.26 ignores seg. For
    example, when we think we're reading 0001:40:01.0 config space,
    2.6.26 actually reads 0000:40:01.0 config space instead.

  - In 2.6.32, instead of ignoring seg, we return an error if it is
    not zero. Therefore, we fail to find anything on bus 40 and bus 80.

Sean, what system and BIOS version is this? (The 3.4.x dmesg log or
the dmidecode output will contain this information.) I don't expect
HP to change the BIOS, and it wouldn't be reasonable to require users
to debug this issue and upgrade their BIOS in any case. But I would
like to read the release notes or help text that mentions this issue.

If all the buses were in fact in segment 0, the DSDT would typically
not have any _SEG methods at all, because segment 0 is the default.
Yinghai is assuming that HP went to the trouble to *add* _SEG methods
that returned incorrect values. But the fact that HP was aware of the
issue and provided the BIOS "disable ACPI bus segmentation" option
makes it less likely that this is the case.

Also, the system was very likely tested with Windows, and the fact
that the BIOS option is to *disable* segmentation suggests that the
default is segmentation enabled. So my guess is that segmentation
does work with Windows. Sean, can you confirm or deny that? The
AIDA64 tool (free trial version at http://www.aida64.com/) generates
a report with useful information.

I agree with Jonathan's assertion here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=543308#87 that the
BIOS switch is not adequate. Neither is a patched DSDT.

I think it's likely that Windows works with segmentation, using
MMCONFIG, and that Linux is a bit too quick to disable MMCONFIG in
this case.
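The difference between the two kernel versions described above can be modelled in a few lines. This Python sketch is an illustration, not the kernel source: `conf1_address` builds the standard configuration mechanism #1 address (the value written to port 0xCF8), which has no field for a segment, and the two wrapper functions are hypothetical names that mimic the described 2.6.26 and 2.6.32 behaviour. Returning `None` stands in for the error return.

```python
# Sketch only: models the behaviour described above, not the kernel source.

def devfn(slot, func):
    """Pack a device (slot) and function number the way PCI does."""
    return (slot << 3) | func


def conf1_address(bus, dev_fn, reg):
    """Type 1 config address: enable bit, bus, devfn, dword-aligned register.

    Note that there is no room for a segment number in this encoding.
    """
    return 0x80000000 | (bus << 16) | (dev_fn << 8) | (reg & 0xFC)


def conf1_read_ignoring_seg(seg, bus, dev_fn, reg):
    """2.6.26-style, as described: seg is silently dropped."""
    return conf1_address(bus, dev_fn, reg)


def conf1_read_checking_seg(seg, bus, dev_fn, reg):
    """2.6.32-style, as described: a non-zero seg is rejected."""
    if seg != 0:
        return None  # stands in for the kernel's error return
    return conf1_address(bus, dev_fn, reg)


if __name__ == "__main__":
    target = (0x40, devfn(1, 0), 0)  # bus 40, device 01, function 0
    print(hex(conf1_read_ignoring_seg(1, *target)))  # same as segment 0
    print(conf1_read_checking_seg(1, *target))       # None: nothing found
```

Under this model, a read aimed at 0001:40:01.0 lands on 0000:40:01.0 with the older behaviour and fails outright with the newer one, which is consistent with devices on buses 40 and 80 disappearing.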
Bug#638863: Errors/warnings show during startup
On Tue, Feb 14, 2012 at 1:53 AM, Jonathan Nieder jrnie...@gmail.com wrote:
> Hi Ralf,
>
> Ralf Jung wrote[1]:
>
> > after upgrading to version 3.2.0-1 of the kernel, one of the two
> > error messages during startup is gone - the corresponding patch by
> > Bjorn Helgaas has been accepted upstream. However, the timer error
> > is still present:
> >
> > $ dmesg | fgrep TCO
> > [ 11.915459] SP5100 TCO timer: SP5100 TCO WatchDog Timer Driver v0.01
> > [ 11.915563] SP5100 TCO timer: mmio address 0xfec000f0 already in use
> >
> > (I attached the full dmesg log)
> > Bjorn also wrote patches for these, which I attached as well -
> > however, the upstream discussion about them just stopped at some
> > point, and/or the patches got lost while kernel.org was down. I do
> > not know how this is usually handled.
>
> Nice. The usual approach is to resend to the relevant people as a
> reminder, like you have done now.
>
> For reference, here's the last discussion of the two patches you
> attached:
>
>   http://thread.gmane.org/gmane.linux.kernel/1184383
>
> If I understand correctly, Cyrill Gorcunov liked the patches. Bjorn,
> would you like to resend, or should I?

I don't think we had clear consensus that my patches were correct. So
I don't want to blindly resend them without more consideration.

Bjorn
Bug#638863: shpchp: Cannot reserve MMIO region error during boot (linux 3.0)
On Fri, Aug 26, 2011 at 8:16 AM, Ralf Jung ralfjun...@gmx.de wrote:
> Hi Bjorn,
>
> > Here's a test patch for the TCO timer issue. That SP5100 watchdog
> > driver is a mess -- it gropes around at hard-coded places in I/O
> > port space -- so while I think this patch will fix the message, the
> > watchdog itself still may not work. If you can verify that the
> > watchdog works, that would be great.
>
> I applied the patches you sent to the list, for both of the issues,
> and the messages are both gone. (Those address conflict messages I
> mentioned were already gone with the plain rc3). However, I don't
> know how to verify that the watchdog works. I installed the watchdog
> package, and I just did kill -9 watchdog PID, but the system keeps
> running. The same behaviour is shown with the 3.0 shipped by Debian.
> Now I don't know if that's the watchdog or my verification method
> failing ;-)

I don't know what's in the watchdog package. I would try the test
program in the kernel sources:
Documentation/watchdog/src/watchdog-simple.c. It looks like if you
kill any other process that has /dev/watchdog open (use lsof to
check), then start watchdog-simple, then suspend or kill *it*, you
should see a system reset after a minute or two.

Thanks for testing all this stuff!

Bjorn
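For readers without the kernel tree at hand, the idea behind such a test program can be sketched in Python. This is not a copy of watchdog-simple.c: the function name `pet_watchdog`, the 10-second interval, and the injectable `sleep` are assumptions made for illustration. The loop keeps the timer from expiring by writing a byte to the device at a fixed interval; once the writer is suspended or killed, a working watchdog should reset the machine.

```python
import io
import time


def pet_watchdog(dev, interval=10, rounds=None, sleep=time.sleep):
    """Write one byte to `dev` every `interval` seconds.

    rounds=None loops forever, which is what a real test would do.
    Returns the number of writes performed when `rounds` is given.
    """
    count = 0
    while rounds is None or count < rounds:
        dev.write(b"\0")  # any write restarts the countdown
        dev.flush()
        count += 1
        sleep(interval)
    return count


if __name__ == "__main__":
    # Dry run against an in-memory buffer, so nothing can reset the machine.
    fake = io.BytesIO()
    pet_watchdog(fake, interval=0, rounds=3)
    print(fake.getvalue())
```

For a real test one would pass `open("/dev/watchdog", "wb", buffering=0)` instead of the in-memory buffer, as root, and only on a machine that may be reset. Whether closing the device stops the timer depends on the driver and kernel configuration, so that part is deliberately left out of the sketch.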
Bug#638863: shpchp: Cannot reserve MMIO region error during boot (linux 3.0)
On Tue, Aug 23, 2011 at 6:13 PM, Bjorn Helgaas bhelg...@google.com wrote:
> Your error is "SP5100 TCO timer: mmio address 0xbafe00 already in
> use". (Same error, but different address.) That looks like it's in
> the middle of your RAM, i.e., it looks completely bogus. Given the
> ugliness of the sp5100_tco driver, that doesn't surprise me.
> Possibly the BIOS configured it differently and we tried to read the
> MMIO address from the wrong (hard-coded) I/O ports. If we can dig up
> a spec for this device, maybe this could be fixed up.

I don't really have time to work on this, unfortunately, but here's a
little info in case somebody else can.

Specs:
  http://support.amd.com/us/Embedded_TechDocs/44413.pdf
    SP5100 Register Reference Guide
  http://support.amd.com/us/Embedded_TechDocs/44414.pdf
    SP5100 Register Programming Requirements
  http://support.amd.com/us/Embedded_TechDocs/44415.pdf
    SP5100 BIOS Developer's Guide

I think the BDG has an example putting the watchdog at 0xfec000f0,
which is where Ralf's system has it. The power-up default looks like
0, so if you have 0xbafe00, either the BIOS put it somewhere
nonsensical (in the middle of RAM), or we're doing something wrong in
reading the address.

It's possible we could learn something by booting Windows and seeing
whether it uses the watchdog, and at what address. Something like the
Device Manager or http://www.aida64.com/ could be useful.

Here are some relevant registers from the RRG:

  2.3 SMBus Module and ACPI Block
  2.3.1 PCI Configuration Registers
    PCI_Reg 0x90    32 bits   Smbus Base Address
  2.3.2 SMBus Registers
    SMBUS register space defined by PCI config 0x90
  2.3.3 Legacy ISA and ACPI Controller
  2.3.3.1 Legacy Block Registers
  2.3.3.1.1 IO-Mapped Control Registers
    IO_Reg 0xCD6    8 bits    PM_Index (p. 163)
    IO_Reg 0xCD7    8 bits    PM_Data
  2.3.3.2 Power Management (PM) Registers (p. 165)
    PM_REG 0x69     8 bits    WatchDogTimerControl (p. 190)
    PM_REG 0x6c     8 bits    WatchDogTimerBase0
    PM_REG 0x6d     8 bits    WatchDogTimerBase1
    PM_REG 0x6e     8 bits    WatchDogTimerBase2
    PM_REG 0x6f     8 bits    WatchDogTimerBase3
  2.3.4 WatchDogTimer Registers (p. 225)
    WD_Mem_Reg 0x00 32 bits   WatchDogControl
    WD_Mem_Reg 0x04 32 bits   WatchDogCount

This is intertwined with piix4. I did notice that piix4_setup() reads
the Smbus Base Address at PCI config offset 0x90, and it assumes I/O
space. The SP5100 supports either MMIO or I/O, so if your system uses
MMIO, things will go wrong. The attached patch checks for that. I
haven't worked out the chain from there to the WatchDogTimerBase
registers.

Bjorn

[Attachment: piix4-check-io, binary data]
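The register list above suggests how the watchdog MMIO base would be fetched: select a PM register through the index port, read it through the data port, and combine the four WatchDogTimerBase bytes. The Python sketch below simulates this with a fake port object so it can run anywhere. Two things are assumptions rather than facts taken from this thread: that WatchDogTimerBase0 holds the least significant byte, and that no low-order control bits need masking. `FakePorts`, `read_pm_reg`, and `read_watchdog_base` are hypothetical names.

```python
# Sketch only: simulated port I/O, not a driver.
PM_INDEX_PORT = 0xCD6   # IO_Reg 0xCD6, PM_Index
PM_DATA_PORT = 0xCD7    # IO_Reg 0xCD7, PM_Data
WDT_BASE0 = 0x6C        # PM_REG 0x6c..0x6f, WatchDogTimerBase0..3


class FakePorts:
    """Stand-in for real port I/O, backed by a dict of PM registers."""

    def __init__(self, pm_regs):
        self.pm_regs = dict(pm_regs)
        self.index = 0

    def outb(self, value, port):
        if port == PM_INDEX_PORT:
            self.index = value  # latch which PM register is selected

    def inb(self, port):
        if port == PM_DATA_PORT:
            return self.pm_regs.get(self.index, 0)
        return 0xFF  # unclaimed port


def read_pm_reg(ports, reg):
    """Index/data access: select the register, then read its value."""
    ports.outb(reg, PM_INDEX_PORT)
    return ports.inb(PM_DATA_PORT)


def read_watchdog_base(ports):
    """Assumption: Base0 is the least significant byte of the address."""
    base = 0
    for i in range(4):
        base |= read_pm_reg(ports, WDT_BASE0 + i) << (8 * i)
    return base


if __name__ == "__main__":
    # Register contents chosen to reproduce the address seen on Ralf's system.
    ports = FakePorts({0x6C: 0xF0, 0x6D: 0x00, 0x6E: 0xC0, 0x6F: 0xFE})
    print(hex(read_watchdog_base(ports)))
```

If a driver read these bytes from the wrong index, or the firmware never programmed them, the assembled value would be arbitrary, which is one way an address such as 0xbafe00 could appear.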
Bug#638863: shpchp: Cannot reserve MMIO region error during boot (linux 3.0)
Hi Ralf, can you attach the complete dmesg log to the bug report, please? I see a snippet (starting with "Bluetooth: SCO socket layer initialized"), but there's a lot of useful information before that.

The dmesg command only shows the most recent part of the log, so if the kernel's buffer has wrapped around, it doesn't show the beginning. The /var/log/dmesg (or similar) file should contain the useful bits.
Bug#638863: shpchp: Cannot reserve MMIO region error during boot (linux 3.0)
Thanks! These tests:

    if ((dev->vendor == PCI_VENDOR_ID_AMD) ||
        (dev->device == PCI_DEVICE_ID_AMD_GOLAM_7450))

are clearly wrong. I suspect "&&" was intended instead of "||", but this code seems to have been that way since the beginning, so I don't know how to verify that. In any event, it would be perfectly legal for any other (non-AMD) manufacturer to make a PCI bridge that uses the 0x7450 device ID, and this code would do the wrong thing with it.

I think we should change that "||" to "&&". That will fix your message and avoid any conflicts with non-AMD devices. It's possible that it will break shpchp on some non-7450 AMD bridges, but we'll just have to deal with those as we discover them.

I cc'd a couple AMD folks in case they have comments [see http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=638863].
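The effect of the two operators is easy to tabulate. The Python sketch below uses the AMD vendor ID 0x1022 and the device ID 0x7450; the "other" vendor and device IDs are placeholders invented for the example, and the function names are hypothetical.

```python
PCI_VENDOR_ID_AMD = 0x1022
PCI_DEVICE_ID_AMD_GOLAM_7450 = 0x7450

OTHER_VENDOR = 0x1234   # placeholder: some non-AMD vendor
OTHER_DEVICE = 0x0001   # placeholder: some device that is not 0x7450


def matches_with_or(vendor, device):
    """The test as it stands in the code quoted above."""
    return vendor == PCI_VENDOR_ID_AMD or device == PCI_DEVICE_ID_AMD_GOLAM_7450


def matches_with_and(vendor, device):
    """The test as proposed."""
    return vendor == PCI_VENDOR_ID_AMD and device == PCI_DEVICE_ID_AMD_GOLAM_7450


if __name__ == "__main__":
    cases = [
        ("AMD 7450", PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_GOLAM_7450),
        ("AMD, other device", PCI_VENDOR_ID_AMD, OTHER_DEVICE),
        ("other vendor, 7450", OTHER_VENDOR, PCI_DEVICE_ID_AMD_GOLAM_7450),
        ("unrelated device", OTHER_VENDOR, OTHER_DEVICE),
    ]
    for name, vendor, device in cases:
        print(f"{name:20} or={matches_with_or(vendor, device)!s:5} "
              f"and={matches_with_and(vendor, device)}")
```

The two middle rows are where the operators disagree: with "||" the quirk fires for every AMD device and for any vendor's device that happens to use ID 0x7450, while "&&" restricts it to the one bridge it names. That is also why the change could affect other AMD bridges that currently take this path.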
Bug#638863: shpchp: Cannot reserve MMIO region error during boot (linux 3.0)
Ralf, can you attach your /proc/iomem contents, too? I looked at the "SP5100 TCO timer: mmio address 0xfec000f0 already in use" message, but I don't see why that address is in use.
Bug#638863: shpchp: Cannot reserve MMIO region error during boot (linux 3.0)
Here's a test patch for the TCO timer issue. That SP5100 watchdog driver is a mess -- it gropes around at hard-coded places in I/O port space -- so while I think this patch will fix the message, the watchdog itself still may not work. If you can verify that the watchdog works, that would be great.

> [ 0.470960] pci_root PNP0A03:00: address space collision: host bridge window [mem 0x000cc000-0x000c] conflicts with Video ROM [mem 0x000c-0x000ce9ff]
> [ 0.471052] pci_root PNP0A03:00: address space collision: host bridge window [mem 0x000ec000-0x000e] conflicts with reserved [mem 0x000ef000-0x000f]

These are unrelated and I'm doing some other work that addresses them.

> [ 1.480097] pci :00:04.0: ASPM: Could not configure common clock

I don't know about this one. It looks like we tried a link retrain but it failed. I would poke Shaohua Li shaohua...@intel.com about this since he submitted the original code for that.

[Attachment: ioapic, binary data]
Bug#638863: shpchp: Cannot reserve MMIO region error during boot (linux 3.0)
Your error is "SP5100 TCO timer: mmio address 0xbafe00 already in use". (Same error, but different address.) That looks like it's in the middle of your RAM, i.e., it looks completely bogus.

Given the ugliness of the sp5100_tco driver, that doesn't surprise me. Possibly the BIOS configured it differently and we tried to read the MMIO address from the wrong (hard-coded) I/O ports. If we can dig up a spec for this device, maybe this could be fixed up.
Bug#481331: usage message fix
Thanks for this patch. Here's a small fix to the usage message:

--- mail.orig 2008-05-15 15:50:14.0 -0600
+++ mail 2008-05-15 15:51:21.0 -0600
@@ -21,7 +21,7 @@

 usage()
 {
-	printf $"Usage: quilt mail {--mbox file|--send} [-m text] [--prefix prefix] [--sender ...] [--from ...] [--to ...] [--cc ...] [--bcc ...] [--subject ...] [--signature file]\n"
+	printf $"Usage: quilt mail {--mbox file|--send} [--select] [-m text] [--prefix prefix] [--sender ...] [--from ...] [--to ...] [--cc ...] [--bcc ...] [--subject ...] [--signature file]\n"
 	if [ x$1 = x-h ]
 	then
 		printf $"
Bug#386694: ketchup bracket expression patch
I sent Matt this patch, which solves the problem for me:

To complement the character class matched by a bracket expression,
the exclamation mark seems more widely accepted than circumflex. Bash
accepts either, but dash, ksh, and The Open Group shell command
language spec accept only exclamation mark.

Dash is installed as /bin/sh on recent Ubuntu systems, and the fact
that it doesn't accept circumflex to complement bracket expressions
causes errors like this:

  Unpacking linux-2.6.20.tar.bz2
  mv: cannot move `linux-2.6.20/..' to `../..': Device or resource busy

Problem reports:
  https://launchpad.net/bugs/69804
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=386694
  http://www.archivum.info/linux.debian.bugs.dist/2006-09/msg02777.html

References:
  bash: http://www.gnu.org/software/bash/manual/bashref.html#SEC34 (sec 3.5.8.1)
  ksh: http://www.cs.princeton.edu/~jlk/kornshell/doc/man93.html#File Name Generation
  TOG: http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_13 (sec 2.13.1)

--- ketchup.orig 2006-05-01 14:09:00.0 -0600
+++ ketchup 2007-04-20 14:15:36.0 -0600
@@ -433,7 +433,7 @@
         error("Unpacking failed: ", err)
         sys.exit(-1)

-    err = os.system("mv linux*/* linux*/.[^.]* ..; rmdir linux*")
+    err = os.system("mv linux*/* linux*/.[!.]* ..; rmdir linux*")
     if err:
         error("Unpacking failed: ", err)
         sys.exit(-1)
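The failure mode can be reproduced without a shell. Python's standard `fnmatch` module happens to follow the same convention as the stricter shells: `[!...]` complements a bracket expression, while a leading `^` is taken literally. The sketch below uses it as a stand-in for dash, which is an analogy and not a claim that the two implementations are identical; the file names are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing, including the "." and ".." entries
# that a pattern starting with a dot can pick up.
NAMES = [".", "..", ".config", ".gitignore", "Makefile"]


def matches(pattern, names=NAMES):
    """Return the names that match a glob pattern, case-sensitively."""
    return [n for n in names if fnmatchcase(n, pattern)]


if __name__ == "__main__":
    print("with !:", matches(".[!.]*"))
    print("with ^:", matches(".[^.]*"))
```

With `!` the pattern selects the hidden files and skips `..`; with `^` treated as a literal character it selects only `..`, which corresponds to the `mv: cannot move 'linux-2.6.20/..'` error quoted above.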
Bug#284111: xserver-xfree86: Doesn't scan PCI domains above 0000 on startup
On Thu, 2005-01-13 at 17:59 -0500, Branden Robinson wrote:
On Fri, Dec 31, 2004 at 12:18:29AM -0800, David Mosberger wrote:

  Branden> I wonder how many domains we should look for before we give
  Branden> up. I get the feeling doing an ftw() on /proc/pci/pci is
  Branden> not a good idea. Even doing as much as a readdir() feels
  Branden> wrong, but maybe not. :-P

I think readdir() should be kosher. ls in /proc/bus/pci has to get its data somehow. But I'm confused about why we would iterate through all the domains anyway, since we don't seem to iterate through all PCI buses. Maybe X's PCITAG doesn't include a domain, or maybe there's no config file syntax for specifying it?

I'm not terribly familiar with multi-domain machines. From what I recall, the domain-changes to /proc/bus/pci were SPARC-specific and I'm not sure whether that approach is the final answer.

The domain changes to /proc/bus/pci are implemented for sparc64, ppc64, ia64, alpha, and mips (see pci_name_bus()). They aren't all identical (sparc64 uses %04x:%02x always, while ia64 uses %04x:%02x only for non-zero domain), but they look close enough that one could try %02x first, then %04x:%02x.

I added Matthew Wilcox in case he has additional input.
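The lookup order suggested above could look roughly like this. It is a Python sketch of the naming logic only, not X server code; `candidate_bus_dirs` and `find_bus_dir` are hypothetical names. One deliberate deviation, which is an assumption of this sketch: the short `%02x` form is tried only for domain 0, because for a non-zero domain a bare bus number could name a bus in domain 0 instead.

```python
import os


def candidate_bus_dirs(domain, bus, root="/proc/bus/pci"):
    """Directory names to try for a bus, in order of preference."""
    candidates = []
    if domain == 0:
        # ia64-style naming omits the domain when it is zero.
        candidates.append(f"{root}/{bus:02x}")
    # sparc64-style naming always includes the domain.
    candidates.append(f"{root}/{domain:04x}:{bus:02x}")
    return candidates


def find_bus_dir(domain, bus, root="/proc/bus/pci", exists=os.path.isdir):
    """Return the first candidate that exists, or None."""
    for path in candidate_bus_dirs(domain, bus, root):
        if exists(path):
            return path
    return None


if __name__ == "__main__":
    print(candidate_bus_dirs(0, 0x40))
    print(candidate_bus_dirs(1, 0x40))
```

The `exists` parameter is there so the function can be exercised without a real /proc tree; on a live system the default `os.path.isdir` would be used.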