Re: bsd.mp hits witness panic under vmm (single CPU)
On Thu, Jun 07, 2018 at 07:22:23PM -0700, Philip Guenther wrote:
> On Thu, 7 Jun 2018, Mike Larkin wrote:
> > Is this a panic inside the guest in vmm, or is this the host panicking when
> > you're doing something while a VM is running in vmm on that host?
> >
> > Can't really tell from the trace here...
>
> This was a guest panicking. visa@ thinks this is the same intr_legacy8
> panic as reported previously.
>

Could be!
Re: bsd.mp hits witness panic under vmm (single CPU)
On Thu, 7 Jun 2018, Mike Larkin wrote:
> Is this a panic inside the guest in vmm, or is this the host panicking when
> you're doing something while a VM is running in vmm on that host?
>
> Can't really tell from the trace here...

This was a guest panicking. visa@ thinks this is the same intr_legacy8
panic as reported previously.
Re: bsd.mp hits witness panic under vmm (single CPU)
On Thu, Jun 07, 2018 at 05:13:06PM -0700, Philip Guenther wrote:
>
> The GENERIC bsd kernel is happy under vmm, but booting a GENERIC.MP kernel
> hits a witness panic. I suspect some "one CPU only" optimization is
> resulting in the witness code being misinformed.
>
> Here's the boot output in the vmm console. (Yes, the userland is out of
> date, but that shouldn't lead to a witness panic either.)
>
>
> (The weird "show witness" output for scsi_base.c mutexes is because
> they're on the stack and need to be unlinked from witness before
> returning; that *might* be causing the problem here, but I doubt it. I'm
> starting on a diff for that part...)
>
>
> Philip Guenther
>

Is this a panic inside the guest in vmm, or is this the host panicking when
you're doing something while a VM is running in vmm on that host?

Can't really tell from the trace here...

-ml

> ---
> Copyright (c) 1982, 1986, 1989, 1991, 1993
> The Regents of the University of California. All rights reserved.
> Copyright (c) 1995-2018 OpenBSD. All rights reserved.
bsd.mp hits witness panic under vmm (single CPU)
The GENERIC bsd kernel is happy under vmm, but booting a GENERIC.MP kernel
hits a witness panic. I suspect some "one CPU only" optimization is
resulting in the witness code being misinformed.

Here's the boot output in the vmm console. (Yes, the userland is out of
date, but that shouldn't lead to a witness panic either.)

(The weird "show witness" output for scsi_base.c mutexes is because
they're on the stack and need to be unlinked from witness before
returning; that *might* be causing the problem here, but I doubt it. I'm
starting on a diff for that part...)

Philip Guenther

---
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California. All rights reserved.
Copyright (c) 1995-2018 OpenBSD. All rights reserved. https://www.OpenBSD.org

OpenBSD 6.3-current (GENERIC.MP) #25: Thu Jun 7 16:29:55 PDT 2018
    guenther@morgaine.local:/usr/src/sys-realclean/arch/amd64/compile/GENERIC.MP
real mem = 520093696 (496MB)
avail mem = 485457920 (462MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0
acpi at bios0 not configured
cpu0 at mainbus0: (uniprocessor)
cpu0: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz, 2594.54 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,RDSEED,ADX,SMAP,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
pvbus0 at mainbus0: OpenBSD
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00
virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00
viornd0 at virtio0
virtio0: irq 3
virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Storage" rev 0x00
vioblk0 at virtio1
scsibus1 at vioblk0: 2 targets
sd0 at scsibus1 targ 0 lun 0: SCSI3 0/direct fixed
sd0: 4096MB, 512 bytes/sector, 8388608 sectors
virtio1: irq 5
virtio2 at pci0 dev 3 function 0 "OpenBSD VMM Control" rev 0x00
vmmci0 at virtio2
virtio2: irq 6
isa0 at mainbus0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16450, no fifo
com0: console
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (0084d990f4e53393.a) swap on sd0b dump on sd0b
Automatic boot in progress: starting file system checks.
/dev/sd0a (0084d990f4e53393.a): file system is clean; not checking
/dev/sd0e (0084d990f4e53393.e): file system is clean; not checking
/dev/sd0d (0084d990f4e53393.d): file system is clean; not checking
setting tty flags
pfctl: pfctl_rules
pfctl: DIOCXROLLBACK: Invalid argument
pf enabled
starting network
pfctl: pfctl_rules
pfctl: DIOCXROLLBACK: Invalid argument
reordering libraries:panic: acquiring blockable sleep lock with spinlock or critical section held (kernel_lock) _lock @ /usr/src/sys-realclean/arch/amd64/amd64/intr.c:525
Stopped at db_enter+0x5: popq %rbp
TID PID UID PRFLAGS PFLAGS CPU COMMAND
*522028 67277 0 0x14000 0x2000 reaper
db_enter() at db_enter+0x5
panic() at panic+0x138
witness_checkorder(81b7c59c,20d,0,81cf7ca0,8002af00) at witness_checkorder+0xd32
___mp_lock(8002af00,8e0eaca0,81bdaff0) at ___mp_lock+0x70
intr_handler(1,8002ae80) at intr_handler+0x40
Xintr_legacy8_untramp(8e0ead60,81d16c60,c,10,8e0ead30,ffff814562c0) at Xintr_legacy8_untramp+0x155
Xspllower(0,282,818c9e53,1ca9c,ff000257,10) at Xspllower+0xc
uvm_pmr_freepages(1f12000,ff001f75e380) at uvm_pmr_freepages+0x204
pmap_do_remove(ff001bd30a18,ff001f75f5a0,8e0ab4d0,81053c20) at pmap_do_remove+0x463
uvm_map_teardown(0) at uvm_map_teardown+0x143
uvmspace_free(8e0f9148) at uvmspace_free+0x36
uvm_exit(8e0f9148) at uvm_exit+0x16
reaper() at reaper+0x156
end trace frame: 0x0, count: 2
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports. Insufficient info makes it difficult to find and fix bugs.
ddb{0}>
ddb{0}> show locks
exclusive mutex r = 0 (0x81d1bcc0) locked @ /usr/src/sys-realclean/uvm/uvm_pmemrange.c:1124
ddb{0}> show witness
Sleep locks:
sysctllk (type: rwlock, depth: 0) -- last acquired @ /usr/src/sys-realclean/kern/kern_sysctl.c:233
>lock (type: rwlock, depth: 2) -- last acquired @ /usr/src/sys-realclean/uvm/uvm_map.c:1936
netlock (type: rwlock, depth: 1) -- last acquired @ /usr/src/sys-realclean/netinet/igmp.c:609
pools (type: rwlock, depth: 2) -- last acquired @ /usr/src/sys-realclean/kern/subr_pool.c:474
>ar_lock (type: rwlock, depth: 2) -- last acquired @ /usr/src/sys-realclean/net/rtable.c:500
swplk (type: rwlock, depth: 0) -- last acquired @ /usr/src/sys-realclean/uvm/uvm_swap.c:615
>i_lock (type: rrwlock, depth: 1) -- last acquired @ /usr/src/sys-realclean/ufs/ufs/ufs_vnops.c:1559
>lock (type: rwlock,
Re: apmd(8): poll timer miscalculation
On Thu, Jun 07, 2018 at 11:27:34AM +, sunil+b...@nimmagadda.net wrote:
> >Synopsis: apmd(8) poll timer off by 10x
> >Category: system
> >Environment:
>     System : OpenBSD 6.3
>     Details : OpenBSD 6.3-current (GENERIC.MP) #54: Wed May 30 23:03:50
> MDT 2018
>
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
>     Architecture: OpenBSD.amd64
>     Machine : amd64
> >Description:
>     With apmd_flags="-Az10", expected apmd(8) to suspend when
>     battery is at 10%, however, it didn't check in time and
>     laptop ran out of power.
> >How-To-Repeat:
>     Disconnect A/C adapter and run with -z percent greater
>     than current estimated battery life reported by apm(8);
>     poll every minute, for example...
>     # rcctl stop apmd
>     # apmd -A -z90 -t60
>     should suspend in a minute, however, it suspends after 10
>     minutes.
> >Fix:
>     The following diff...
>
>     1. Provides a dedicated timer that fires every 10 seconds
>     instead of relying on EVFILT_READ frequency.
>
>     2. Increments a counter and checks against timeout value,
>     if it exceeds, invokes auto-action.
>
>     3. Wraps a few long lines that exceed 80 cols upon code
>     shuffle.

I don't think we need to introduce an additional timer, or do so much
code shufflin' here, to fix your issue.

The problem seems to be that apmtimeout is incremented once per
iteration but must meet or exceed ts.tv_sec to trigger a status check,
so the period for battery status checks is at least n^2 seconds.

I'm pretty sure this was unintentional, though it makes it unlikely
that apmd will catch a low battery percentage and suspend the machine
before the battery is totally exhausted. Especially since, by default,
n = 600.

Here's a minimal diff that checks if we timed out on return from
kevent. There's additional cleanup that this change implies, but
I've left it out for the moment.

Of note is that an event for either of the descriptors resets the
timeout, regardless of how long it's been since we checked the battery
status: this is effectively the current behavior. If people want, we
can add logic to decrement the maximum timeout accordingly on each
iteration and reset it when kevent truly times out. This sounds closer
to what the manpage advertises for the -t option.

Caveat: I'm unfamiliar with apmd(8) and I don't have time just this
second to scour the change log to figure out why the behavior is what
it is now. Someone more familiar with the code will need to corroborate
what I've said and the attached diff.

That said, feel free to try this diff in the meantime. Does this work
for you?

Anyone more familiar with apmd(8) wanna chime in here?

--
Scott Cheloha

Index: usr.sbin/apmd/apmd.c
===================================================================
RCS file: /cvs/src/usr.sbin/apmd/apmd.c,v
retrieving revision 1.81
diff -u -p -r1.81 apmd.c
--- usr.sbin/apmd/apmd.c	15 Oct 2017 15:14:49 -	1.81
+++ usr.sbin/apmd/apmd.c	7 Jun 2018 20:26:11 -
@@ -488,7 +488,7 @@ main(int argc, char *argv[])
 		if ((rv = kevent(kq, NULL, 0, ev, 1, &sts)) < 0)
 			break;
 
-		if (apmtimeout >= ts.tv_sec) {
+		if (rv == 0) {
 			apmtimeout = 0;
 
 			/* wakeup for timeout: take status */
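
To put numbers on the timing argument above, here is a rough model in C.
It is only a sketch, not apmd code: the function names are made up for
illustration, and it assumes that no descriptor events arrive, so every
kevent() call blocks for the full timeout and returns 0.

```c
/*
 * Hypothetical model, not apmd code: seconds until the first battery
 * status check when no descriptor events arrive, so every kevent()
 * call blocks for the full timeout.
 */
long
old_loop_seconds(long timeout)
{
	long apmtimeout = 0, elapsed = 0;

	for (;;) {
		apmtimeout += 1;		/* bumped once per iteration */
		elapsed += timeout;		/* kevent() timed out */
		if (apmtimeout >= timeout)	/* old check */
			return elapsed;
	}
}

long
patched_loop_seconds(long timeout)
{
	long elapsed = 0;
	int rv;

	for (;;) {
		elapsed += timeout;		/* kevent() timed out */
		rv = 0;				/* kevent() returns 0 on timeout */
		if (rv == 0)			/* check from the diff */
			return elapsed;
	}
}
```

Under that assumption, the default n = 600 gives 360000 seconds (100
hours) before the first check with the old test, and 600 seconds with
the rv == 0 test. Events on either descriptor shorten each iteration, so
real-world numbers will differ.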
Re: booting after hibernate = reboot
Philip Guenther writes:

> On Thu, 7 Jun 2018, Solene Rapenne wrote:
>> >Synopsis: booting after hibernating loads and reboot
> ...
>> Details : OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun 5 19:22:09
>> MDT 2018
>>
>> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> ...
>> When using ZZZ to hibernate, I can see that it writes to disk as
>> usual, then the computer shuts down. Everything is alright. At
>> boot, it uses /bsd.booted, the console displays the usual loading
>> screen, and when it comes to the line looking like
>> "unhibernate @block" (it displays too fast), then the screen
>> goes black and after a few seconds, the computer reboots.
>
> Can you find and report the "OpenBSD 6.etc" log line in /var/log/messages*
> from the previous kernel where hibernate + resume worked correctly?
> That'll narrow down when the regression occurred.
>
> Depending on how long a range of time+builds that is, there are various
> strategies for identifying the source of the failure...
>
>
> Philip Guenther

The previous kernel was
OpenBSD 6.3-current (GENERIC.MP) #43: Mon May 21 16:30:33 MDT 2018

but I found this in /var/log/messages, for each reboot after unhibernate

Jun 7 17:21:51 t400 syslogd[90658]: start
Jun 7 17:21:51 t400 /bsd: OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun 5 19:22:09 MDT 2018
Jun 7 17:21:51 t400 /bsd: dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
Jun 7 17:21:51 t400 /bsd: real mem = 4168814592 (3975MB)
Jun 7 17:21:51 t400 /bsd: avail mem = 3995996160 (3810MB)
Jun 7 17:21:51 t400 /bsd: mpath0 at root
Jun 7 17:21:51 t400 /bsd: scsibus0 at mpath0: 256 targets
Jun 7 17:21:51 t400 /bsd: mainbus0 at root
Jun 7 17:21:51 t400 /bsd: bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xe0010 (74 entries)
Jun 7 17:21:51 t400 /bsd: bios0: vendor LENOVO version "7UET66WW (2.16 )" date 04/22/2009
Jun 7 17:21:51 t400 /bsd: bios0: LENOVO 2768V8S
Jun 7 17:21:51 t400 /bsd: acpi0 at bios0: rev 2
Jun 7 17:21:51 t400 /bsd: acpi0: sleep states S0 S3 S4 S5
Jun 7 17:21:51 t400 /bsd: acpi0: tables DSDT FACP SSDT ECDT APIC MCFG HPET BOOT ASF! SSDT TCPA SSDT SSDT SSDT
Jun 7 17:21:51 t400 /bsd: acpi0: wakeup devices LID_(S3) SLPB(S3) IGBE(S4) EXP0(S4) EXP1(S4) EXP2(S4) EXP3(S4) EXP4(S4) PCI1(S4) USB0(S3) USB3(S3) USB5(S3) EHC0(S3) EHC1(S3) HDEF(S4)
Jun 7 17:21:51 t400 /bsd: acpitimer0 at acpi0: 3579545 Hz, 24 bits
Jun 7 17:21:51 t400 /bsd: acpiec0 at acpi0
Jun 7 17:21:51 t400 /bsd: acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
Jun 7 17:21:51 t400 /bsd: cpu0 at mainbus0: apid 0 (boot processor)
Jun 7 17:21:51 t400 /bsd: cpu0: Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz, 2527.50 MHz
Jun 7 17:21:51 t400 /bsd: cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
Jun 7 17:21:51 t400 /bsd: cpu0: 6MB 64b/line 16-way L2 cache
Jun 7 17:21:51 t400 /bsd: cpu0: smt 0, core 0, package 0
Jun 7 17:21:51 t400 /bsd: mtrr: Pentium Pro MTRR support, 7 var ranges, 88 fixed ranges
Jun 7 17:21:51 t400 /bsd: using xsave
Jun 7 17:21:51 t400 /bsd: cpu0: apic clock running at 265MHz
Jun 7 17:21:51 t400 /bsd: cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2.1.3, IBE
Jun 7 17:21:51 t400 /bsd: cpu1 at mainbus0: apid 1 (application processor)
Jun 7 17:21:51 t400 sendsyslog: dropped 1 message, error 57, pid 23145
Jun 7 17:21:51 t400 unbound: [38185:0] notice: init module 0: validator
Jun 7 17:21:51 t400 unbound: [38185:0] notice: init module 1: iterator
Jun 7 17:21:52 t400 savecore: no core dump
Jun 7 17:21:54 t400 apmd: battery status: high. external power status: connected. estimated battery life 100%
Jun 7 17:21:59 t400 /bsd: lock order reversal:
Jun 7 17:21:59 t400 /bsd: 1st 0xff011adb9460 vmmaplk (>lock) @ /usr/src/sys/uvm/uvm_fault.c:1441
Jun 7 17:21:59 t400 /bsd: 2nd 0x80104138 drmdevlk (>struct_mutex) @ /usr/src/sys/dev/pci/drm/i915/i915_gem.c:1801
Jun 7 17:21:59 t400 /bsd: lock order ">struct_mutex"(rwlock) -> ">lock"(rwlock) first seen at:
Jun 7 17:21:59 t400 /bsd: #0 witness_checkorder+0x4b4
Jun 7 17:21:59 t400 /bsd: #1 _rw_enter+0x68
Jun 7 17:21:59 t400 /bsd: #2 vm_map_lock_ln+0xbc
Jun 7 17:21:59 t400 /bsd: #3 uvm_map+0x1a1
Jun 7 17:21:59 t400 /bsd: #4 km_alloc+0x16a
Jun 7 17:21:59 t400 /bsd: #5 bus_space_map+0x159
Jun 7 17:21:59 t400 /bsd: #6 i965_alloc_ifp+0xc1
Jun 7 17:21:59 t400 /bsd: #7 intel_gtt_chipset_setup+0x1b1
Jun 7 17:21:59 t400 /bsd: #8 intel_enable_gtt+0x26
Jun 7 17:21:59 t400 /bsd: #9 i915_gem_init_hw+0x43
Jun 7 17:21:59 t400 /bsd: #10 i915_gem_init+0x24e
Jun 7 17:21:59 t400 /bsd: #11 i915_driver_load+0xfc1
Jun 7 17:21:59 t400 /bsd: #12 inteldrm_attach+0x37f
Jun 7 17:21:59 t400 /bsd: #13 config_attach+0x20e
Jun 7 17:21:59 t400 /bsd: #14
Re: booting after hibernate = reboot
On Thu, Jun 07, 2018 at 11:23:20AM -0700, Philip Guenther wrote:
> On Thu, 7 Jun 2018, Solene Rapenne wrote:
> > >Synopsis: booting after hibernating loads and reboot
> ...
> > Details : OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun 5 19:22:09
> > MDT 2018
> >
> > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> ...
> > When using ZZZ to hibernate, I can see that it writes to disk as
> > usual, then the computer shuts down. Everything is alright. At
> > boot, it uses /bsd.booted, the console displays the usual loading
> > screen, and when it comes to the line looking like
> > "unhibernate @block" (it displays too fast), then the screen
> > goes black and after a few seconds, the computer reboots.
>
> Can you find and report the "OpenBSD 6.etc" log line in /var/log/messages*
> from the previous kernel where hibernate + resume worked correctly?
> That'll narrow down when the regression occurred.
>
> Depending on how long a range of time+builds that is, there are various
> strategies for identifying the source of the failure...

Perhaps Solene can do better than this, but I tested jsing's softraid
unhibernate diff on June 2nd around noon UTC with a freshly updated cvs
tree. At that point hibernate definitely worked.
Re: booting after hibernate = reboot
On Thu, 7 Jun 2018, Solene Rapenne wrote:
> >Synopsis: booting after hibernating loads and reboot
...
> Details : OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun 5 19:22:09
> MDT 2018
>
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
...
> When using ZZZ to hibernate, I can see that it writes to disk as
> usual, then the computer shuts down. Everything is alright. At
> boot, it uses /bsd.booted, the console displays the usual loading
> screen, and when it comes to the line looking like
> "unhibernate @block" (it displays too fast), then the screen
> goes black and after a few seconds, the computer reboots.

Can you find and report the "OpenBSD 6.etc" log line in /var/log/messages*
from the previous kernel where hibernate + resume worked correctly?
That'll narrow down when the regression occurred.

Depending on how long a range of time+builds that is, there are various
strategies for identifying the source of the failure...


Philip Guenther
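
One common strategy of that kind is bisection between a known-good and a
known-bad build. The sketch below is only an illustration under stated
assumptions: the function names are hypothetical, the predicate stands in
for booting a given build and trying hibernate + resume by hand, and the
search assumes every build after the first bad one is bad as well. The
endpoints 43 and 84 are the build numbers reported in this thread; the
regression point 70 in the stand-in predicate is invented for the
example.

```c
/* Hypothetical predicate: nonzero if the given build fails to resume. */
typedef int (*build_is_bad_fn)(int build);

/*
 * Return the first bad build in (good, bad], assuming "good" works,
 * "bad" fails, and every build after the first failure also fails.
 */
int
first_bad_build(int good, int bad, build_is_bad_fn is_bad)
{
	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;

		if (is_bad(mid))
			bad = mid;	/* regression is at or before mid */
		else
			good = mid;	/* regression is after mid */
	}
	return bad;
}

/* Stand-in for manual testing: pretends build 70 introduced the bug. */
int
example_is_bad(int build)
{
	return build >= 70;
}
```

Calling first_bad_build(43, 84, example_is_bad) needs about six trial
boots instead of forty, which is the point of narrowing the range first.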
booting after hibernate = reboot
>Synopsis: booting after hibernating loads and reboot
>Category: kernel
>Environment:
    System : OpenBSD 6.3
    Details : OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun 5 19:22:09 MDT 2018
        dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

    Architecture: OpenBSD.amd64
    Machine : amd64
>Description:
    When using ZZZ to hibernate, I can see that it writes to disk as
    usual, then the computer shuts down. Everything is alright. At
    boot, it uses /bsd.booted, the console displays the usual loading
    screen, and when it comes to the line looking like
    "unhibernate @block" (it displays too fast), then the screen
    goes black and after a few seconds, the computer reboots.

    After that reboot, the usual /bsd is used as a default, and FWIW
    there is no need to fsck the drives.
>How-To-Repeat:
    ZZZ
    power on the computer and wait until it reboots
>Fix:


dmesg:
OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun 5 19:22:09 MDT 2018
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4168814592 (3975MB)
avail mem = 3996000256 (3810MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xe0010 (74 entries)
bios0: vendor LENOVO version "7UET66WW (2.16 )" date 04/22/2009
bios0: LENOVO 2768V8S
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SSDT ECDT APIC MCFG HPET BOOT ASF! SSDT TCPA SSDT SSDT SSDT
acpi0: wakeup devices LID_(S3) SLPB(S3) IGBE(S4) EXP0(S4) EXP1(S4) EXP2(S4) EXP3(S4) EXP4(S4) PCI1(S4) USB0(S3) USB3(S3) USB5(S3) EHC0(S3) EHC1(S3) HDEF(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpiec0 at acpi0
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz, 2527.45 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu0: 6MB 64b/line 16-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 7 var ranges, 88 fixed ranges
using xsave
cpu0: apic clock running at 266MHz
cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2.1.3, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz, 2527.01 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu1: 6MB 64b/line 16-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins, remapped to apid 1
acpimcfg0 at acpi0 addr 0xe000, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (AGP_)
acpiprt2 at acpi0: bus 2 (EXP0)
acpiprt3 at acpi0: bus 3 (EXP1)
acpiprt4 at acpi0: bus -1 (EXP2)
acpiprt5 at acpi0: bus -1 (EXP3)
acpiprt6 at acpi0: bus -1 (EXP4)
acpiprt7 at acpi0: bus 21 (PCI1)
acpicpu0 at acpi0: !C2(500@1 mwait.1@0x10), C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: !C2(500@1 mwait.1@0x10), C1(1000@1 mwait.1), PSS
acpipwrres0 at acpi0: PUBS, resource for USB0, USB3, USB5, EHC0, EHC1
acpitz0 at acpi0: critical temperature is 127 degC
acpitz1 at acpi0: critical temperature is 100 degC
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: SLPB
acpicmos0 at acpi0
"IBM0057" at acpi0 not configured
"INTC0102" at acpi0 not configured
acpibat0 at acpi0: BAT0 model "42T4645" serial 793 type LION oem "Panasonic"
acpiac0 at acpi0: AC unit online
acpithinkpad0 at acpi0
"PNP0C14" at acpi0 not configured
acpidock0 at acpi0: GDCK not docked (0)
acpivideo0 at acpi0: VID_
acpivout0 at acpivideo0: LCD0
acpivideo1 at acpi0: VID_
cpu0: Enhanced SpeedStep 2527 MHz: speeds: 2534, 2533, 1600, 800 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel GM45 Host" rev 0x07
inteldrm0 at pci0 dev 2 function 0 "Intel GM45 Video" rev 0x07
drm0 at inteldrm0
intagp0 at inteldrm0
agp0 at intagp0: aperture at 0xd000, size 0x1000
inteldrm0: msi
inteldrm0: 1440x900, 32bpp
wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation)
wsdisplay0: screen 1-5 added (std, vt100 emulation)
"Intel GM45 Video" rev 0x07 at pci0 dev 2 function 1 not configured
"Intel GM45 HECI" rev 0x07 at pci0 dev 3 function 0 not configured
pciide0 at pci0 dev 3 function 2 "Intel GM45 PT IDER" rev 0x07: DMA (unsupported), channel 0 wired to native-PCI, channel 1 wired to native-PCI
pciide0: using apic 1 int 18 for native-PCI interrupt
pciide0: channel 0 ignored (not responding; disabled or no drives?)
pciide0: channel 1 ignored (not responding; disabled or no drives?)
puc0 at pci0 dev 3 function 3 "Intel GM45 KT" rev 0x07: ports: 16 com
com4 at
apmd(8): poll timer miscalculation
>Synopsis: apmd(8) poll timer off by 10x
>Category: system
>Environment:
    System : OpenBSD 6.3
    Details : OpenBSD 6.3-current (GENERIC.MP) #54: Wed May 30 23:03:50 MDT 2018
        dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

    Architecture: OpenBSD.amd64
    Machine : amd64
>Description:
    With apmd_flags="-Az10", expected apmd(8) to suspend when
    battery is at 10%, however, it didn't check in time and
    laptop ran out of power.
>How-To-Repeat:
    Disconnect A/C adapter and run with -z percent greater
    than current estimated battery life reported by apm(8);
    poll every minute, for example...
    # rcctl stop apmd
    # apmd -A -z90 -t60
    should suspend in a minute, however, it suspends after 10
    minutes.
>Fix:
    The following diff...

    1. Provides a dedicated timer that fires every 10 seconds
       instead of relying on EVFILT_READ frequency.

    2. Increments a counter and checks against timeout value,
       if it exceeds, invokes auto-action.

    3. Wraps a few long lines that exceed 80 cols upon code
       shuffle.

Index: apmd.c
===================================================================
RCS file: /cvs/src/usr.sbin/apmd/apmd.c,v
retrieving revision 1.81
diff -u -p -w -r1.81 apmd.c
--- apmd.c	15 Oct 2017 15:14:49 -	1.81
+++ apmd.c	7 Jun 2018 10:56:06 -
@@ -56,6 +56,9 @@
 #define AUTO_SUSPEND 1
 #define AUTO_HIBERNATE 2
 
+#define TIMO (10*60)	/* 10 minutes */
+#define TIMER_ID 1
+
 const char apmdev[] = _PATH_APM_CTLDEV;
 const char sockfile[] = _PATH_APM_SOCKET;
 
@@ -341,8 +344,6 @@ hibernate(int ctl_fd)
 	ioctl(ctl_fd, APM_IOC_HIBERNATE, 0);
 }
 
-#define TIMO (10*60)	/* 10 minutes */
-
 int
 main(int argc, char *argv[])
 {
@@ -353,13 +354,12 @@ main(int argc, char *argv[])
 	int statonly = 0;
 	int powerstatus = 0, powerbak = 0, powerchange = 0;
 	int noacsleep = 0;
-	struct timespec ts = {TIMO, 0}, sts = {0, 0};
 	struct apm_power_info pinfo;
-	time_t apmtimeout = 0;
+	time_t apmtimeout = TIMO, counter = 0;
 	const char *sockname = sockfile;
 	const char *errstr;
 	int kq, nchanges;
-	struct kevent ev[2];
+	struct kevent ev[3];
 	int ncpu_mib[2] = { CTL_HW, HW_NCPU };
 	int ncpu;
 	size_t ncpu_sz = sizeof(ncpu);
@@ -379,8 +379,8 @@ main(int argc, char *argv[])
 			sockname = optarg;
 			break;
 		case 't':
-			ts.tv_sec = strtoul(optarg, NULL, 0);
-			if (ts.tv_sec == 0)
+			apmtimeout = strtoul(optarg, NULL, 0);
+			if (apmtimeout == 0)
 				usage();
 			break;
 		case 's':	/* status only */
@@ -466,14 +466,15 @@ main(int argc, char *argv[])
 
 	EV_SET(&ev[0], sock_fd, EVFILT_READ, EV_ADD | EV_ENABLE | EV_CLEAR,
 	    0, 0, NULL);
+	EV_SET(&ev[1], TIMER_ID, EVFILT_TIMER, EV_ADD, 0, 1, NULL);
 	if (ctl_fd == -1)
-		nchanges = 1;
+		nchanges = 2;
 	else {
-		EV_SET(&ev[1], ctl_fd, EVFILT_READ, EV_ADD | EV_ENABLE |
+		EV_SET(&ev[2], ctl_fd, EVFILT_READ, EV_ADD | EV_ENABLE |
 		    EV_CLEAR, 0, 0, NULL);
-		nchanges = 2;
+		nchanges = 3;
 	}
-	if (kevent(kq, ev, nchanges, NULL, 0, &sts) < 0)
+	if (kevent(kq, ev, nchanges, NULL, 0, NULL) < 0)
 		error("kevent", NULL);
 
 	if (sysctl(ncpu_mib, 2, &ncpu, &ncpu_sz, NULL, 0) < 0)
@@ -482,41 +483,14 @@ main(int argc, char *argv[])
 	for (;;) {
 		int rv;
 
-		sts = ts;
-
-		apmtimeout += 1;
-		if ((rv = kevent(kq, NULL, 0, ev, 1, &sts)) < 0)
+		if ((rv = kevent(kq, NULL, 0, ev, 1, NULL)) < 0)
 			break;
 
-		if (apmtimeout >= ts.tv_sec) {
-			apmtimeout = 0;
-
-			/* wakeup for timeout: take status */
-			powerbak = power_status(ctl_fd, 0, &pinfo);
-			if (powerstatus != powerbak) {
-				powerstatus = powerbak;
-				powerchange = 1;
-			}
-
-			if (!powerstatus && autoaction &&
-			    autolimit > (int)pinfo.battery_life) {
-				syslog(LOG_NOTICE,
-				    "estimated battery life %d%%, "
-				    "autoaction limit set to %d%% .",
-				    pinfo.battery_life,
-				    autolimit
-				);
-
-				if (autoaction == AUTO_SUSPEND)