Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]
On 2020-05-27 23:38, John Baldwin wrote:
> No. I get that constantly on a desktop that never suspends/resumes. It only
> started after upgrading to 12.0.

If you have time, could you investigate why the USB host controller's Root HUB PCI register flips to -1U? That is what causes these spurious events. Maybe it is some kind of PCI power-save feature that is not timed correctly.

--HPS
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]
On 5/27/20 2:05 PM, Hans Petter Selasky wrote: > On 2020-05-27 15:41, Justin Hibbits wrote: >> On Wed, 27 May 2020 06:27:16 -0700 >> John Baldwin wrote: >> >>> On 5/27/20 2:39 AM, Andriy Gapon wrote: On 27/05/2020 11:13, Andriy Gapon wrote: > I added more diagnostics and it seems to support the idea that the > problem is related to I/O cycles and bridges. > > ACPI timer suddenly starts returning 0x and that lasts for > tens of microseconds before the timer goes back to returning > normal values with an expected increase. > AMD provides a proprietary way to access ACPI registers via MMIO > (0xfed808xx). That mechanism is unaffected, ACPI timer register > always returns good values. > > The problem seems to happen when restoring configuration of a > particular PCI bridge. What's interesting is that the bridge > decodes one memory range and one I/O range. > > Looking at pci_cfg_restore() I wonder if it is wise to restore > PCIR_COMMAND so early. Could it be that after the resume the > bridge is configured with a wrong I/O range (e.g., too wide) and > by writing PCIR_COMMAND we enable that decoding. So, the bridge > steals I/O cycles destined for ACPI support hardware. If there is > nothing behind the bridge to handle those ports, then we get those > bad readings. Once the bridge configuration is fully restored, the > I/O handling goes back to normal. From what I see, this looks like a BIOS bug. Upon resume, it swaps window configurations of pcib1 and pcib2 (until FreeBSD restores them). pcib1 originally does not have an I/O window. So, BIOS programs both base and limit of pcib2 I/O window to zero. When FreeBSD writes its command register to enable I/O decoding it starts claiming 0x0 - 0xFFF I/O port range. That covers the ACPI ports at 0x8xx. Some printf-s. 
From (verbose) boot time: pcib1: domain0 pcib1: secondary bus 1 pcib1: subordinate bus 1 pcib1: memory decode 0xfea0-0xfeaf pcib2: domain0 pcib2: secondary bus 2 pcib2: subordinate bus 2 pcib2: I/O decode0xf000-0x pcib2: memory decode 0xfe90-0xfe9f My printf-s from resume time: pcib1: old I/O base (low): 0xf1 pcib1: old I/O base (high): 0x0 pcib1: old I/O limit (low): 0x1 pcib1: old I/O limit (high): 0x0 pcib2: old I/O base (low): 0x1 pcib2: old I/O base (high): 0x0 pcib2: old I/O limit (low): 0x1 pcib2: old I/O limit (high): 0x0 >>> >>> The "solution" I think is to have resume be multi-pass and to resume >>> all the bridges first before trying to resume leaf devices (including >>> timers), but that's a fair bit of work. It might be that we just >>> need to resume timer interrupts later after the new-bus resume (I >>> think we currently do it before?), though the reason for that was to >>> allow resume methods in devices to sleep (I'm not sure if any do). >>> >> >> That sounds like a good fit for https://reviews.freebsd.org/D203 . >> Someone (TM) just needs to take it over the finish line... 6 years >> later. > > Is this perhaps related to: > https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237666 No. I get that constantly on a desktop that never suspends/resumes. It only started after upgrading to 12.0. -- John Baldwin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]
On 2020-05-27 15:41, Justin Hibbits wrote: On Wed, 27 May 2020 06:27:16 -0700 John Baldwin wrote: On 5/27/20 2:39 AM, Andriy Gapon wrote: On 27/05/2020 11:13, Andriy Gapon wrote: I added more diagnostics and it seems to support the idea that the problem is related to I/O cycles and bridges. ACPI timer suddenly starts returning 0x and that lasts for tens of microseconds before the timer goes back to returning normal values with an expected increase. AMD provides a proprietary way to access ACPI registers via MMIO (0xfed808xx). That mechanism is unaffected, ACPI timer register always returns good values. The problem seems to happen when restoring configuration of a particular PCI bridge. What's interesting is that the bridge decodes one memory range and one I/O range. Looking at pci_cfg_restore() I wonder if it is wise to restore PCIR_COMMAND so early. Could it be that after the resume the bridge is configured with a wrong I/O range (e.g., too wide) and by writing PCIR_COMMAND we enable that decoding. So, the bridge steals I/O cycles destined for ACPI support hardware. If there is nothing behind the bridge to handle those ports, then we get those bad readings. Once the bridge configuration is fully restored, the I/O handling goes back to normal. From what I see, this looks like a BIOS bug. Upon resume, it swaps window configurations of pcib1 and pcib2 (until FreeBSD restores them). pcib1 originally does not have an I/O window. So, BIOS programs both base and limit of pcib2 I/O window to zero. When FreeBSD writes its command register to enable I/O decoding it starts claiming 0x0 - 0xFFF I/O port range. That covers the ACPI ports at 0x8xx. Some printf-s. 
From (verbose) boot time: pcib1: domain0 pcib1: secondary bus 1 pcib1: subordinate bus 1 pcib1: memory decode 0xfea0-0xfeaf pcib2: domain0 pcib2: secondary bus 2 pcib2: subordinate bus 2 pcib2: I/O decode0xf000-0x pcib2: memory decode 0xfe90-0xfe9f My printf-s from resume time: pcib1: old I/O base (low): 0xf1 pcib1: old I/O base (high): 0x0 pcib1: old I/O limit (low): 0x1 pcib1: old I/O limit (high): 0x0 pcib2: old I/O base (low): 0x1 pcib2: old I/O base (high): 0x0 pcib2: old I/O limit (low): 0x1 pcib2: old I/O limit (high): 0x0 The "solution" I think is to have resume be multi-pass and to resume all the bridges first before trying to resume leaf devices (including timers), but that's a fair bit of work. It might be that we just need to resume timer interrupts later after the new-bus resume (I think we currently do it before?), though the reason for that was to allow resume methods in devices to sleep (I'm not sure if any do). That sounds like a good fit for https://reviews.freebsd.org/D203 . Someone (TM) just needs to take it over the finish line... 6 years later. Is this perhaps related to: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=237666 --HPS ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]
On Wed, 27 May 2020 06:27:16 -0700 John Baldwin wrote: > On 5/27/20 2:39 AM, Andriy Gapon wrote: > > On 27/05/2020 11:13, Andriy Gapon wrote: > >> I added more diagnostics and it seems to support the idea that the > >> problem is related to I/O cycles and bridges. > >> > >> ACPI timer suddenly starts returning 0x and that lasts for > >> tens of microseconds before the timer goes back to returning > >> normal values with an expected increase. > >> AMD provides a proprietary way to access ACPI registers via MMIO > >> (0xfed808xx). That mechanism is unaffected, ACPI timer register > >> always returns good values. > >> > >> The problem seems to happen when restoring configuration of a > >> particular PCI bridge. What's interesting is that the bridge > >> decodes one memory range and one I/O range. > >> > >> Looking at pci_cfg_restore() I wonder if it is wise to restore > >> PCIR_COMMAND so early. Could it be that after the resume the > >> bridge is configured with a wrong I/O range (e.g., too wide) and > >> by writing PCIR_COMMAND we enable that decoding. So, the bridge > >> steals I/O cycles destined for ACPI support hardware. If there is > >> nothing behind the bridge to handle those ports, then we get those > >> bad readings. Once the bridge configuration is fully restored, the > >> I/O handling goes back to normal. > > > > From what I see, this looks like a BIOS bug. > > Upon resume, it swaps window configurations of pcib1 and pcib2 > > (until FreeBSD restores them). pcib1 originally does not have an > > I/O window. So, BIOS programs both base and limit of pcib2 I/O > > window to zero. When FreeBSD writes its command register to > > enable I/O decoding it starts claiming 0x0 - 0xFFF I/O port range. > > That covers the ACPI ports at 0x8xx. > > > > Some printf-s. 
> > From (verbose) boot time: > > pcib1: domain0 > > pcib1: secondary bus 1 > > pcib1: subordinate bus 1 > > pcib1: memory decode 0xfea0-0xfeaf > > pcib2: domain0 > > pcib2: secondary bus 2 > > pcib2: subordinate bus 2 > > pcib2: I/O decode0xf000-0x > > pcib2: memory decode 0xfe90-0xfe9f > > > > My printf-s from resume time: > > pcib1: old I/O base (low): 0xf1 > > pcib1: old I/O base (high): 0x0 > > pcib1: old I/O limit (low): 0x1 > > pcib1: old I/O limit (high): 0x0 > > pcib2: old I/O base (low): 0x1 > > pcib2: old I/O base (high): 0x0 > > pcib2: old I/O limit (low): 0x1 > > pcib2: old I/O limit (high): 0x0 > > The "solution" I think is to have resume be multi-pass and to resume > all the bridges first before trying to resume leaf devices (including > timers), but that's a fair bit of work. It might be that we just > need to resume timer interrupts later after the new-bus resume (I > think we currently do it before?), though the reason for that was to > allow resume methods in devices to sleep (I'm not sure if any do). > That sounds like a good fit for https://reviews.freebsd.org/D203 . Someone (TM) just needs to take it over the finish line... 6 years later. - Justin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]
On 27/05/2020 16:27, John Baldwin wrote:
> The "solution" I think is to have resume be multi-pass and to resume all the
> bridges first before trying to resume leaf devices (including timers), but
> that's a fair bit of work. It might be that we just need to resume timer
> interrupts later after the new-bus resume (I think we currently do it
> before?), though the reason for that was to allow resume methods in devices
> to sleep (I'm not sure if any do).

But it's not only about timers. {sbin,bin,micro,etc}uptime() calls can return garbage as well and confuse their callers.

-- 
Andriy Gapon
Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]
On 5/27/20 2:39 AM, Andriy Gapon wrote: > On 27/05/2020 11:13, Andriy Gapon wrote: >> I added more diagnostics and it seems to support the idea that the problem is >> related to I/O cycles and bridges. >> >> ACPI timer suddenly starts returning 0x and that lasts for tens of >> microseconds before the timer goes back to returning normal values with an >> expected increase. >> AMD provides a proprietary way to access ACPI registers via MMIO >> (0xfed808xx). >> That mechanism is unaffected, ACPI timer register always returns good values. >> >> The problem seems to happen when restoring configuration of a particular PCI >> bridge. What's interesting is that the bridge decodes one memory range and >> one >> I/O range. >> >> Looking at pci_cfg_restore() I wonder if it is wise to restore PCIR_COMMAND >> so >> early. Could it be that after the resume the bridge is configured with a >> wrong >> I/O range (e.g., too wide) and by writing PCIR_COMMAND we enable that >> decoding. >> So, the bridge steals I/O cycles destined for ACPI support hardware. If >> there >> is nothing behind the bridge to handle those ports, then we get those bad >> readings. >> Once the bridge configuration is fully restored, the I/O handling goes back >> to >> normal. > > From what I see, this looks like a BIOS bug. > Upon resume, it swaps window configurations of pcib1 and pcib2 (until FreeBSD > restores them). pcib1 originally does not have an I/O window. So, BIOS > programs both base and limit of pcib2 I/O window to zero. When FreeBSD > writes > its command register to enable I/O decoding it starts claiming 0x0 - 0xFFF I/O > port range. That covers the ACPI ports at 0x8xx. > > Some printf-s. 
> From (verbose) boot time: > pcib1: domain0 > pcib1: secondary bus 1 > pcib1: subordinate bus 1 > pcib1: memory decode 0xfea0-0xfeaf > pcib2: domain0 > pcib2: secondary bus 2 > pcib2: subordinate bus 2 > pcib2: I/O decode0xf000-0x > pcib2: memory decode 0xfe90-0xfe9f > > My printf-s from resume time: > pcib1: old I/O base (low): 0xf1 > pcib1: old I/O base (high): 0x0 > pcib1: old I/O limit (low): 0x1 > pcib1: old I/O limit (high): 0x0 > pcib2: old I/O base (low): 0x1 > pcib2: old I/O base (high): 0x0 > pcib2: old I/O limit (low): 0x1 > pcib2: old I/O limit (high): 0x0 The "solution" I think is to have resume be multi-pass and to resume all the bridges first before trying to resume leaf devices (including timers), but that's a fair bit of work. It might be that we just need to resume timer interrupts later after the new-bus resume (I think we currently do it before?), though the reason for that was to allow resume methods in devices to sleep (I'm not sure if any do). -- John Baldwin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
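To make the multi-pass idea above concrete, here is a minimal toy sketch in C. It is not FreeBSD's new-bus code: `struct toy_dev`, `resume_two_pass()` and the flat device array are all hypothetical, and a real implementation would have to walk the device tree and deal with bridges nested behind other bridges. It only shows the ordering property being proposed, namely that every bridge is resumed before any leaf device such as a timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened stand-in for a device in the resume list. */
struct toy_dev {
	const char *name;
	bool        is_bridge;
	bool        resumed;
};

/*
 * Resume all bridges in pass 0 and all remaining (leaf) devices in pass 1.
 * The resume order is recorded in `order`, which must have room for n
 * entries. Returns the number of devices resumed.
 */
static size_t
resume_two_pass(struct toy_dev *devs, size_t n, const char **order)
{
	size_t done = 0;

	assert(devs != NULL && order != NULL);
	for (int pass = 0; pass < 2; pass++) {
		bool want_bridge = (pass == 0);

		for (size_t i = 0; i < n; i++) {
			if (devs[i].is_bridge != want_bridge || devs[i].resumed)
				continue;
			devs[i].resumed = true;	/* stands in for the device's resume method */
			order[done++] = devs[i].name;
		}
	}
	return (done);
}
```

Called on a list such as { timer, pcib1, NIC, pcib2 }, the recorded order comes out as pcib1, pcib2, timer, NIC: the leaf devices keep their relative order but only run once both bridges are back.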
Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]
On 27/05/2020 11:13, Andriy Gapon wrote:
> I added more diagnostics and it seems to support the idea that the problem is
> related to I/O cycles and bridges.
>
> ACPI timer suddenly starts returning 0x and that lasts for tens of
> microseconds before the timer goes back to returning normal values with an
> expected increase.
> AMD provides a proprietary way to access ACPI registers via MMIO (0xfed808xx).
> That mechanism is unaffected, ACPI timer register always returns good values.
>
> The problem seems to happen when restoring configuration of a particular PCI
> bridge. What's interesting is that the bridge decodes one memory range and
> one I/O range.
>
> Looking at pci_cfg_restore() I wonder if it is wise to restore PCIR_COMMAND so
> early. Could it be that after the resume the bridge is configured with a
> wrong I/O range (e.g., too wide) and by writing PCIR_COMMAND we enable that
> decoding? So, the bridge steals I/O cycles destined for ACPI support hardware.
> If there is nothing behind the bridge to handle those ports, then we get those
> bad readings.
> Once the bridge configuration is fully restored, the I/O handling goes back to
> normal.

From what I see, this looks like a BIOS bug.
Upon resume, it swaps window configurations of pcib1 and pcib2 (until FreeBSD restores them). pcib1 originally does not have an I/O window. So, BIOS programs both base and limit of pcib2 I/O window to zero. When FreeBSD writes its command register to enable I/O decoding it starts claiming 0x0 - 0xFFF I/O port range. That covers the ACPI ports at 0x8xx.

Some printf-s.
From (verbose) boot time:
pcib1: domain 0
pcib1: secondary bus 1
pcib1: subordinate bus 1
pcib1: memory decode 0xfea0-0xfeaf
pcib2: domain 0
pcib2: secondary bus 2
pcib2: subordinate bus 2
pcib2: I/O decode 0xf000-0x
pcib2: memory decode 0xfe90-0xfe9f

My printf-s from resume time:
pcib1: old I/O base (low): 0xf1
pcib1: old I/O base (high): 0x0
pcib1: old I/O limit (low): 0x1
pcib1: old I/O limit (high): 0x0
pcib2: old I/O base (low): 0x1
pcib2: old I/O base (high): 0x0
pcib2: old I/O limit (low): 0x1
pcib2: old I/O limit (high): 0x0

-- 
Andriy Gapon
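For readers following the register values above, this is a small C sketch of how a PCI-to-PCI bridge's I/O base and limit registers translate into a port range, as I understand the standard bridge register layout: bits 7:4 of each low register supply address bits 15:12, the low nibble value 1 means 32-bit I/O addressing (so the "high" registers supply bits 31:16), and the limit's low 12 bits are implied to be 0xfff. The names `struct io_window`, `decode_io_window()` and `window_claims_port()` are mine, not from the FreeBSD sources, and port 0x808 in the usage note is only a made-up example of "an ACPI port at 0x8xx".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded I/O window of a PCI-to-PCI bridge (hypothetical helper type). */
struct io_window {
	uint32_t base;
	uint32_t limit;
	bool     open;	/* false when base > limit, i.e. nothing is forwarded */
};

/* Turn the raw base/limit register values into an inclusive port range. */
static struct io_window
decode_io_window(uint8_t base_lo, uint16_t base_hi,
    uint8_t limit_lo, uint16_t limit_hi)
{
	struct io_window w;

	w.base = (uint32_t)(base_lo & 0xf0) << 8;		/* bits 15:12 */
	w.limit = ((uint32_t)(limit_lo & 0xf0) << 8) | 0xfff;	/* low 12 bits implied */
	if ((base_lo & 0x0f) == 0x01)				/* 32-bit I/O decoding */
		w.base |= (uint32_t)base_hi << 16;
	if ((limit_lo & 0x0f) == 0x01)
		w.limit |= (uint32_t)limit_hi << 16;
	w.open = (w.base <= w.limit);
	return (w);
}

/* Would the bridge claim an I/O cycle to `port` once I/O decoding is enabled? */
static bool
window_claims_port(const struct io_window *w, uint32_t port)
{
	assert(w != NULL);
	return (w->open && port >= w->base && port <= w->limit);
}
```

Feeding in the resume-time pcib2 values (base low 0x1, limit low 0x1, both high words 0x0) gives the range 0x0 - 0xfff, which would claim a port such as 0x808. The pcib1 values (base low 0xf1, limit low 0x1) give base 0xf000 above limit 0xfff, i.e. a closed window.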
Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]
On 27/05/2020 01:14, John Baldwin wrote:
> On 5/26/20 11:55 AM, Konstantin Belousov wrote:
>> On Tue, May 26, 2020 at 06:22:13PM +0300, Andriy Gapon wrote:
>>> I am not sure if this is just a coincidence but it appears as if a write to
>>> some PCI configuration register could temporarily interfere with access to
>>> the PM timer I/O port.
>>> Is that plausible?
>> If something disabled a BAR, then typical response of x86 chipset for timed
>> out read from PCIe is 0xf... .
>
> And the ACPI timer might be "behind" the isab0 bridge device which would
> indeed cause this.

I added more diagnostics and it seems to support the idea that the problem is related to I/O cycles and bridges.

ACPI timer suddenly starts returning 0x and that lasts for tens of microseconds before the timer goes back to returning normal values with an expected increase.
AMD provides a proprietary way to access ACPI registers via MMIO (0xfed808xx). That mechanism is unaffected, ACPI timer register always returns good values.

The problem seems to happen when restoring configuration of a particular PCI bridge. What's interesting is that the bridge decodes one memory range and one I/O range.

Looking at pci_cfg_restore() I wonder if it is wise to restore PCIR_COMMAND so early. Could it be that after the resume the bridge is configured with a wrong I/O range (e.g., too wide) and by writing PCIR_COMMAND we enable that decoding? So, the bridge steals I/O cycles destined for ACPI support hardware. If there is nothing behind the bridge to handle those ports, then we get those bad readings. Once the bridge configuration is fully restored, the I/O handling goes back to normal.
Is this possible?

P.S.
pci_cfg_restore() also attempts to restore PCIR_INTPIN, but it's read-only?

-- 
Andriy Gapon
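The ordering question raised above can be illustrated with a toy model in C. Everything here is hypothetical (`struct toy_bridge`, the `toy_write_*()` helpers, both `restore_*()` functions); it is not `pci_cfg_restore()` and the thread does not say which fix, if any, was adopted. The only real-world facts it relies on are that bit 0 of the PCI command register enables I/O space decoding and bit 1 enables memory space decoding. The model simply records whether I/O decoding was ever switched on while the window registers still held whatever the firmware left there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_PORTEN	0x0001u	/* PCI command register: I/O space enable */
#define CMD_MEMEN	0x0002u	/* PCI command register: memory space enable */

/* Hypothetical model of the bridge state that matters for this question. */
struct toy_bridge {
	uint16_t command;
	uint8_t  io_base_lo;
	uint8_t  io_limit_lo;
	bool     window_restored;	/* saved window values written back */
	bool     decoded_stale;		/* I/O decode was on with a stale window */
};

static void
toy_write_command(struct toy_bridge *b, uint16_t val)
{
	assert(b != NULL);
	b->command = val;
	if ((val & CMD_PORTEN) != 0 && !b->window_restored)
		b->decoded_stale = true;
}

static void
toy_write_window(struct toy_bridge *b, uint8_t base_lo, uint8_t limit_lo)
{
	assert(b != NULL);
	b->io_base_lo = base_lo;
	b->io_limit_lo = limit_lo;
	b->window_restored = true;
}

/* The order being questioned: command register first, windows afterwards. */
static void
restore_command_first(struct toy_bridge *b, uint16_t cmd,
    uint8_t base_lo, uint8_t limit_lo)
{
	toy_write_command(b, cmd);
	toy_write_window(b, base_lo, limit_lo);
}

/* One possible alternative: keep decoding off until the windows are back. */
static void
restore_command_last(struct toy_bridge *b, uint16_t cmd,
    uint8_t base_lo, uint8_t limit_lo)
{
	toy_write_command(b, (uint16_t)(cmd & ~(CMD_PORTEN | CMD_MEMEN)));
	toy_write_window(b, base_lo, limit_lo);
	toy_write_command(b, cmd);
}
```

In this model `restore_command_first()` sets `decoded_stale`, which corresponds to the short interval in which the bridge could claim I/O cycles it should not, while `restore_command_last()` ends with the same register contents without ever passing through that state.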
Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]
On 5/26/20 11:55 AM, Konstantin Belousov wrote: > On Tue, May 26, 2020 at 06:22:13PM +0300, Andriy Gapon wrote: >> On 25/05/2020 11:37, Andriy Gapon wrote: >>> Also, there is another issue related to atrtc. >>> When I have both drivers attached, and also when I have only atrtc attached >>> (efi.rt.disabled=1), system clock jumps 10 minutes forward after each >>> suspend / >>> resume cycle (S0 -> S3 -> S0). That does not happen for reboot and shutdown >>> cycles. I haven't investigated this deeper, but it is a curious problem. >> >> Actually, I was wrong. The problem can also occur with efirtc alone. >> Also, sometimes there is a different problem where there are no callouts for >> a >> period of time on the order of minutes. I tracked it to cc_lastscan being >> set >> to a value greater than the current uptime. So, any scheduled callout gets >> scheduled at cc_lastscan and it is a while before the uptime catches up. >> >> It seemed that both issues were connected and were a result of the uptime >> jumping forward by some minutes and then jumping back to a sane value. >> If something important happened during the weird period, like getting time of >> day from hardware or invoking a callout, it lead to the observed effects. >> >> So, that gave me some ideas where to add debugging checks. >> What I determined is that ACPI timer (ACPI-fast) could produce a reading of >> all >> 1-s like happens when there is no hardware response. 
>> >> I caught one such instance and got a stack trace for it (but no crash dump >> because devices had not resumed yet): >> tc_windup() at tc_windup+0x318/frame 0xfe00a7a19300 >> tc_ticktock() at tc_ticktock+0x4b/frame 0xfe00a7a19320 >> hardclock() at hardclock+0x107/frame 0xfe00a7a19360 >> handleevents() at handleevents+0xb3/frame 0xfe00a7a193a0 >> timercb() at timercb+0x196/frame 0xfe00a7a193f0 >> lapic_handle_timer() at lapic_handle_timer+0x98/frame 0xfe00a7a19420 >> Xtimerint() at Xtimerint+0xb1/frame 0xfe00a7a19420 >> --- interrupt, rip = 0x80b34500, rsp = 0xfe00a7a194f8, rbp = >> 0xfe00a7a19540 --- >> acpi_pcib_write_config() at acpi_pcib_write_config/frame 0xfe00a7a19540 >> pci_cfg_restore() at pci_cfg_restore+0x2cc/frame 0xfe00a7a195a0 >> pci_resume_child() at pci_resume_child+0xee/frame 0xfe00a7a195e0 >> pci_resume() at pci_resume+0x49/frame 0xfe00a7a19630 >> bus_generic_resume_child() at bus_generic_resume_child+0x43/frame >> 0xfe00a7a19650 >> bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a19680 >> bus_generic_resume_child() at bus_generic_resume_child+0x43/frame >> 0xfe00a7a196a0 >> bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a196d0 >> bus_generic_resume_child() at bus_generic_resume_child+0x43/frame >> 0xfe00a7a196f0 >> bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a19720 >> bus_generic_resume_child() at bus_generic_resume_child+0x43/frame >> 0xfe00a7a19740 >> root_resume() at root_resume+0x29/frame 0xfe00a7a19770 >> acpi_EnterSleepState() at acpi_EnterSleepState+0x73b/frame 0xfe00a7a197f0 >> acpi_AckSleepState() at acpi_AckSleepState+0x144/frame 0xfe00a7a19820 >> devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfe00a7a19870 >> vn_ioctl() at vn_ioctl+0x132/frame 0xfe00a7a19980 >> devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfe00a7a199a0 >> kern_ioctl() at kern_ioctl+0x27b/frame 0xfe00a7a19a00 >> sys_ioctl() at sys_ioctl+0x123/frame 0xfe00a7a19ad0 >> amd64_syscall() at amd64_syscall+0x140/frame 0xfe00a7a19bf0 
>> fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe00a7a19bf0 >> >> I am not sure if this is just a coincidence but it appears as if a write to >> some >> PCI configuration register could temporarily interfere with access to the PM >> timer I/O port. >> Is that plausible? > If something disabled a BAR, then typical response of x86 chipset for timed > out read from PCIe is 0xf... . And the ACPI timer might be "behind" the isab0 bridge device which would indeed cause this. -- John Baldwin ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]
On Tue, May 26, 2020 at 06:22:13PM +0300, Andriy Gapon wrote:
> On 25/05/2020 11:37, Andriy Gapon wrote:
> > Also, there is another issue related to atrtc.
> > When I have both drivers attached, and also when I have only atrtc attached
> > (efi.rt.disabled=1), system clock jumps 10 minutes forward after each
> > suspend / resume cycle (S0 -> S3 -> S0). That does not happen for reboot
> > and shutdown cycles. I haven't investigated this deeper, but it is a
> > curious problem.
>
> Actually, I was wrong. The problem can also occur with efirtc alone.
> Also, sometimes there is a different problem where there are no callouts for a
> period of time on the order of minutes. I tracked it to cc_lastscan being set
> to a value greater than the current uptime. So, any scheduled callout gets
> scheduled at cc_lastscan and it is a while before the uptime catches up.
>
> It seemed that both issues were connected and were a result of the uptime
> jumping forward by some minutes and then jumping back to a sane value.
> If something important happened during the weird period, like getting time of
> day from hardware or invoking a callout, it led to the observed effects.
>
> So, that gave me some ideas where to add debugging checks.
> What I determined is that ACPI timer (ACPI-fast) could produce a reading of
> all 1-s like happens when there is no hardware response.
>
> I caught one such instance and got a stack trace for it (but no crash dump
> because devices had not resumed yet):
> tc_windup() at tc_windup+0x318/frame 0xfe00a7a19300
> tc_ticktock() at tc_ticktock+0x4b/frame 0xfe00a7a19320
> hardclock() at hardclock+0x107/frame 0xfe00a7a19360
> handleevents() at handleevents+0xb3/frame 0xfe00a7a193a0
> timercb() at timercb+0x196/frame 0xfe00a7a193f0
> lapic_handle_timer() at lapic_handle_timer+0x98/frame 0xfe00a7a19420
> Xtimerint() at Xtimerint+0xb1/frame 0xfe00a7a19420
> --- interrupt, rip = 0x80b34500, rsp = 0xfe00a7a194f8, rbp = 0xfe00a7a19540 ---
> acpi_pcib_write_config() at acpi_pcib_write_config/frame 0xfe00a7a19540
> pci_cfg_restore() at pci_cfg_restore+0x2cc/frame 0xfe00a7a195a0
> pci_resume_child() at pci_resume_child+0xee/frame 0xfe00a7a195e0
> pci_resume() at pci_resume+0x49/frame 0xfe00a7a19630
> bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a19650
> bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a19680
> bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a196a0
> bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a196d0
> bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a196f0
> bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a19720
> bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a19740
> root_resume() at root_resume+0x29/frame 0xfe00a7a19770
> acpi_EnterSleepState() at acpi_EnterSleepState+0x73b/frame 0xfe00a7a197f0
> acpi_AckSleepState() at acpi_AckSleepState+0x144/frame 0xfe00a7a19820
> devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfe00a7a19870
> vn_ioctl() at vn_ioctl+0x132/frame 0xfe00a7a19980
> devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfe00a7a199a0
> kern_ioctl() at kern_ioctl+0x27b/frame 0xfe00a7a19a00
> sys_ioctl() at sys_ioctl+0x123/frame 0xfe00a7a19ad0
> amd64_syscall() at amd64_syscall+0x140/frame 0xfe00a7a19bf0
> fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe00a7a19bf0
>
> I am not sure if this is just a coincidence but it appears as if a write to
> some PCI configuration register could temporarily interfere with access to
> the PM timer I/O port.
> Is that plausible?

If something disabled a BAR, then typical response of x86 chipset for timed out read from PCIe is 0xf... .
acpi timer reads all ones [Was: efirtc + atrtc at the same time]
On 25/05/2020 11:37, Andriy Gapon wrote:
> Also, there is another issue related to atrtc.
> When I have both drivers attached, and also when I have only atrtc attached
> (efi.rt.disabled=1), system clock jumps 10 minutes forward after each
> suspend / resume cycle (S0 -> S3 -> S0). That does not happen for reboot and
> shutdown cycles. I haven't investigated this deeper, but it is a curious
> problem.

Actually, I was wrong. The problem can also occur with efirtc alone.
Also, sometimes there is a different problem where there are no callouts for a period of time on the order of minutes. I tracked it to cc_lastscan being set to a value greater than the current uptime. So, any scheduled callout gets scheduled at cc_lastscan and it is a while before the uptime catches up.

It seemed that both issues were connected and were a result of the uptime jumping forward by some minutes and then jumping back to a sane value.
If something important happened during the weird period, like getting time of day from hardware or invoking a callout, it led to the observed effects.

So, that gave me some ideas where to add debugging checks.
What I determined is that ACPI timer (ACPI-fast) could produce a reading of all 1-s like happens when there is no hardware response.

I caught one such instance and got a stack trace for it (but no crash dump because devices had not resumed yet):
tc_windup() at tc_windup+0x318/frame 0xfe00a7a19300
tc_ticktock() at tc_ticktock+0x4b/frame 0xfe00a7a19320
hardclock() at hardclock+0x107/frame 0xfe00a7a19360
handleevents() at handleevents+0xb3/frame 0xfe00a7a193a0
timercb() at timercb+0x196/frame 0xfe00a7a193f0
lapic_handle_timer() at lapic_handle_timer+0x98/frame 0xfe00a7a19420
Xtimerint() at Xtimerint+0xb1/frame 0xfe00a7a19420
--- interrupt, rip = 0x80b34500, rsp = 0xfe00a7a194f8, rbp = 0xfe00a7a19540 ---
acpi_pcib_write_config() at acpi_pcib_write_config/frame 0xfe00a7a19540
pci_cfg_restore() at pci_cfg_restore+0x2cc/frame 0xfe00a7a195a0
pci_resume_child() at pci_resume_child+0xee/frame 0xfe00a7a195e0
pci_resume() at pci_resume+0x49/frame 0xfe00a7a19630
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a19650
bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a19680
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a196a0
bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a196d0
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a196f0
bus_generic_resume() at bus_generic_resume+0x29/frame 0xfe00a7a19720
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfe00a7a19740
root_resume() at root_resume+0x29/frame 0xfe00a7a19770
acpi_EnterSleepState() at acpi_EnterSleepState+0x73b/frame 0xfe00a7a197f0
acpi_AckSleepState() at acpi_AckSleepState+0x144/frame 0xfe00a7a19820
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfe00a7a19870
vn_ioctl() at vn_ioctl+0x132/frame 0xfe00a7a19980
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfe00a7a199a0
kern_ioctl() at kern_ioctl+0x27b/frame 0xfe00a7a19a00
sys_ioctl() at sys_ioctl+0x123/frame 0xfe00a7a19ad0
amd64_syscall() at amd64_syscall+0x140/frame 0xfe00a7a19bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe00a7a19bf0

I am not sure if this is just a coincidence but it appears as if a write to some PCI configuration register could temporarily interfere with access to the PM timer I/O port.
Is that plausible?

I'll try to dig up more data.

-- 
Andriy Gapon
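As a rough check on why a single all-ones reading could move the uptime by minutes, here is a small C sketch of the arithmetic. It assumes the consumer computes elapsed ticks in the usual (now - prev) & mask fashion and takes the bogus value at face value; the ACPI PM timer runs at 3.579545 MHz and is either 24 or 32 bits wide, and the thread does not say which width this machine has. The helper names `counter_delta()` and `delta_to_seconds()` are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* ACPI PM timer frequency in Hz (3.579545 MHz). */
#define PM_TIMER_HZ	3579545u

/*
 * Ticks that appear to have elapsed between two reads of a free-running
 * counter that is `bits` wide, allowing for wrap-around.
 */
static uint32_t
counter_delta(uint32_t prev, uint32_t now, unsigned bits)
{
	uint32_t mask;

	assert(bits >= 1 && bits <= 32);
	mask = (bits == 32) ? UINT32_MAX : ((UINT32_C(1) << bits) - 1);
	return ((now - prev) & mask);
}

/* Whole seconds represented by a tick delta at the PM timer frequency. */
static uint32_t
delta_to_seconds(uint32_t delta)
{
	return (delta / PM_TIMER_HZ);
}
```

With a 32-bit counter, an all-ones read following a good value of 0 looks like 1199 seconds (about 20 minutes) of elapsed time, and following a good value of 0x80000000 it looks like 599 seconds (about 10 minutes). With a 24-bit counter the worst case is only about 4 seconds, so the minutes-long jumps described above would fit a 32-bit counter, if this mechanism is indeed the cause.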
Re: efirtc + atrtc at the same time
On Mon, 2020-05-25 at 11:37 +0300, Andriy Gapon wrote:
> I see that on my laptop both efirtc and atrtc get attached.
> The latter is via an ACPI attachment:
> efirtc0:
> efirtc0: registered as a time-of-day clock, resolution 1.00s
> atrtc0: port 0x70-0x71 on acpi0
> atrtc0: registered as a time-of-day clock, resolution 1.00s
>
> I am not sure if this is a problem by itself, but it certainly seems redundant
> to have two drivers controlling the same(?) hardware via different platform
> mechanisms.
> Maybe there is a nice way to automatically disable (or "neutralize") one of
> the drivers?
>

I thought I had done something long ago to prevent atrtc and efirtc from both attaching, but apparently not. I intended to, I even mentioned it in https://reviews.freebsd.org/D14399 but it looks like I never followed up and did the work.

> Also, there is another issue related to atrtc.
> When I have both drivers attached, and also when I have only atrtc attached
> (efi.rt.disabled=1), system clock jumps 10 minutes forward after each
> suspend / resume cycle (S0 -> S3 -> S0). That does not happen for reboot and
> shutdown cycles. I haven't investigated this deeper, but it is a curious
> problem.
>

I've looked at the code for messing with the clock around suspend/resume and never felt like it was doing the right thing (or even anything useful). But I've never owned a freebsd machine that could successfully resume from suspend, so I've never been able to experiment with it.

-- Ian
efirtc + atrtc at the same time
I see that on my laptop both efirtc and atrtc get attached.
The latter is via an ACPI attachment:
efirtc0:
efirtc0: registered as a time-of-day clock, resolution 1.00s
atrtc0: port 0x70-0x71 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.00s

I am not sure if this is a problem by itself, but it certainly seems redundant to have two drivers controlling the same(?) hardware via different platform mechanisms.
Maybe there is a nice way to automatically disable (or "neutralize") one of the drivers?

Also, there is another issue related to atrtc.
When I have both drivers attached, and also when I have only atrtc attached (efi.rt.disabled=1), system clock jumps 10 minutes forward after each suspend / resume cycle (S0 -> S3 -> S0). That does not happen for reboot and shutdown cycles. I haven't investigated this deeper, but it is a curious problem.

-- 
Andriy Gapon