Re: bsd.mp hits witness panic under vmm (single CPU)

2018-06-07 Thread Mike Larkin
On Thu, Jun 07, 2018 at 07:22:23PM -0700, Philip Guenther wrote:
> On Thu, 7 Jun 2018, Mike Larkin wrote:
> > Is this a panic inside the guest in vmm, or is this the host panicking when
> > you're doing something while a VM is running in vmm on that host?
> > 
> > Can't really tell from the trace here...
> 
> This was a guest panicking.  visa@ thinks this is the same intr_legacy8 
> panic as reported previously.
> 

Could be!



Re: bsd.mp hits witness panic under vmm (single CPU)

2018-06-07 Thread Philip Guenther
On Thu, 7 Jun 2018, Mike Larkin wrote:
> Is this a panic inside the guest in vmm, or is this the host panicking when
> you're doing something while a VM is running in vmm on that host?
> 
> Can't really tell from the trace here...

This was a guest panicking.  visa@ thinks this is the same intr_legacy8 
panic as reported previously.



Re: bsd.mp hits witness panic under vmm (single CPU)

2018-06-07 Thread Mike Larkin
On Thu, Jun 07, 2018 at 05:13:06PM -0700, Philip Guenther wrote:
> 
> The GENERIC bsd kernel is happy under vmm, but booting a GENERIC.MP kernel 
> hits a witness panic.  I suspect some "one CPU only" optimization is 
> resulting in the witness code being misinformed.
> 
> Here's the boot output in the vmm console.  (Yes, the userland is out of 
> date, but that shouldn't lead to a witness panic either.)
> 
> 
> (The weird "show witness" output for scsi_base.c mutexes is because 
> they're on the stack and need to be unlinked from witness before 
> returning; that *might* be causing the problem here, but I doubt it.  I'm 
> starting on a diff for that part...)
> 
> 
> Philip Guenther
> 

Is this a panic inside the guest in vmm, or is this the host panicking when
you're doing something while a VM is running in vmm on that host?

Can't really tell from the trace here...

-ml

> ---
> Copyright (c) 1982, 1986, 1989, 1991, 1993
> The Regents of the University of California.  All rights reserved.
> Copyright (c) 1995-2018 OpenBSD. All rights reserved.  https://www.OpenBSD.org
> 
> OpenBSD 6.3-current (GENERIC.MP) #25: Thu Jun  7 16:29:55 PDT 2018
> 
> guenther@morgaine.local:/usr/src/sys-realclean/arch/amd64/compile/GENERIC.MP
> real mem = 520093696 (496MB)
> avail mem = 485457920 (462MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0
> acpi at bios0 not configured
> cpu0 at mainbus0: (uniprocessor)
> cpu0: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz, 2594.54 MHz
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,RDSEED,ADX,SMAP,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> pvbus0 at mainbus0: OpenBSD
> pci0 at mainbus0 bus 0
> pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00
> virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00
> viornd0 at virtio0
> virtio0: irq 3
> virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Storage" rev 0x00
> vioblk0 at virtio1
> scsibus1 at vioblk0: 2 targets
> sd0 at scsibus1 targ 0 lun 0:  SCSI3 0/direct fixed
> sd0: 4096MB, 512 bytes/sector, 8388608 sectors
> virtio1: irq 5
> virtio2 at pci0 dev 3 function 0 "OpenBSD VMM Control" rev 0x00
> vmmci0 at virtio2
> virtio2: irq 6
> isa0 at mainbus0
> isadma0 at isa0
> com0 at isa0 port 0x3f8/8 irq 4: ns16450, no fifo
> com0: console
> vscsi0 at root
> scsibus2 at vscsi0: 256 targets
> softraid0 at root
> scsibus3 at softraid0: 256 targets
> root on sd0a (0084d990f4e53393.a) swap on sd0b dump on sd0b
> Automatic boot in progress: starting file system checks.
> /dev/sd0a (0084d990f4e53393.a): file system is clean; not checking
> /dev/sd0e (0084d990f4e53393.e): file system is clean; not checking
> /dev/sd0d (0084d990f4e53393.d): file system is clean; not checking
> setting tty flags
> pfctl: pfctl_rules
> pfctl: DIOCXROLLBACK: Invalid argument
> pf enabled
> starting network
> pfctl: pfctl_rules
> pfctl: DIOCXROLLBACK: Invalid argument
> reordering libraries:panic: acquiring blockable sleep lock with spinlock or 
> critical section held (kernel_lock) _lock @ 
> /usr/src/sys-realclean/arch/amd64/amd64/intr.c:525
> Stopped at  db_enter+0x5:   popq    %rbp
>    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> *522028  67277  0 0x14000  0x2000  reaper
> db_enter() at db_enter+0x5
> panic() at panic+0x138
> witness_checkorder(81b7c59c,20d,0,81cf7ca0,8002af00) at witness_checkorder+0xd32
> ___mp_lock(8002af00,8e0eaca0,81bdaff0) at ___mp_lock+0x70
> intr_handler(1,8002ae80) at intr_handler+0x40
> Xintr_legacy8_untramp(8e0ead60,81d16c60,c,10,8e0ead30,ffff814562c0) at Xintr_legacy8_untramp+0x155
> Xspllower(0,282,818c9e53,1ca9c,ff000257,10) at Xspllower+0xc
> uvm_pmr_freepages(1f12000,ff001f75e380) at uvm_pmr_freepages+0x204
> pmap_do_remove(ff001bd30a18,ff001f75f5a0,8e0ab4d0,81053c20) at pmap_do_remove+0x463
> uvm_map_teardown(0) at uvm_map_teardown+0x143
> uvmspace_free(8e0f9148) at uvmspace_free+0x36
> uvm_exit(8e0f9148) at uvm_exit+0x16
> reaper() at reaper+0x156
> end trace frame: 0x0, count: 2
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{0}>
> ddb{0}> show locks
> exclusive mutex  r = 0 (0x81d1bcc0) locked @ /usr/src/sys-realclean/uvm/uvm_pmemrange.c:1124
> ddb{0}> show witness
> Sleep locks:
> sysctllk (type: rwlock, depth: 0) -- last acquired @ /usr/src/sys-realclean/kern/kern_sysctl.c:233
>  >lock (type: rwlock, depth: 2) -- last acquired @ /usr/src/sys-realclean/uvm/uvm_map.c:1936
>  netlock (type: 

bsd.mp hits witness panic under vmm (single CPU)

2018-06-07 Thread Philip Guenther


The GENERIC bsd kernel is happy under vmm, but booting a GENERIC.MP kernel 
hits a witness panic.  I suspect some "one CPU only" optimization is 
resulting in the witness code being misinformed.

Here's the boot output in the vmm console.  (Yes, the userland is out of 
date, but that shouldn't lead to a witness panic either.)


(The weird "show witness" output for scsi_base.c mutexes is because 
they're on the stack and need to be unlinked from witness before 
returning; that *might* be causing the problem here, but I doubt it.  I'm 
starting on a diff for that part...)


Philip Guenther

---
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2018 OpenBSD. All rights reserved.  https://www.OpenBSD.org

OpenBSD 6.3-current (GENERIC.MP) #25: Thu Jun  7 16:29:55 PDT 2018
guenther@morgaine.local:/usr/src/sys-realclean/arch/amd64/compile/GENERIC.MP
real mem = 520093696 (496MB)
avail mem = 485457920 (462MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0
acpi at bios0 not configured
cpu0 at mainbus0: (uniprocessor)
cpu0: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz, 2594.54 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,RDSEED,ADX,SMAP,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
pvbus0 at mainbus0: OpenBSD
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00
virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00
viornd0 at virtio0
virtio0: irq 3
virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Storage" rev 0x00
vioblk0 at virtio1
scsibus1 at vioblk0: 2 targets
sd0 at scsibus1 targ 0 lun 0:  SCSI3 0/direct fixed
sd0: 4096MB, 512 bytes/sector, 8388608 sectors
virtio1: irq 5
virtio2 at pci0 dev 3 function 0 "OpenBSD VMM Control" rev 0x00
vmmci0 at virtio2
virtio2: irq 6
isa0 at mainbus0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16450, no fifo
com0: console
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (0084d990f4e53393.a) swap on sd0b dump on sd0b
Automatic boot in progress: starting file system checks.
/dev/sd0a (0084d990f4e53393.a): file system is clean; not checking
/dev/sd0e (0084d990f4e53393.e): file system is clean; not checking
/dev/sd0d (0084d990f4e53393.d): file system is clean; not checking
setting tty flags
pfctl: pfctl_rules
pfctl: DIOCXROLLBACK: Invalid argument
pf enabled
starting network
pfctl: pfctl_rules
pfctl: DIOCXROLLBACK: Invalid argument
reordering libraries:panic: acquiring blockable sleep lock with spinlock or 
critical section held (kernel_lock) _lock @ 
/usr/src/sys-realclean/arch/amd64/amd64/intr.c:525
Stopped at  db_enter+0x5:   popq    %rbp
   TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
*522028  67277  0 0x14000  0x2000  reaper
db_enter() at db_enter+0x5
panic() at panic+0x138
witness_checkorder(81b7c59c,20d,0,81cf7ca0,8002af00) at witness_checkorder+0xd32
___mp_lock(8002af00,8e0eaca0,81bdaff0) at ___mp_lock+0x70
intr_handler(1,8002ae80) at intr_handler+0x40
Xintr_legacy8_untramp(8e0ead60,81d16c60,c,10,8e0ead30,ffff814562c0) at Xintr_legacy8_untramp+0x155
Xspllower(0,282,818c9e53,1ca9c,ff000257,10) at Xspllower+0xc
uvm_pmr_freepages(1f12000,ff001f75e380) at uvm_pmr_freepages+0x204
pmap_do_remove(ff001bd30a18,ff001f75f5a0,8e0ab4d0,81053c20) at pmap_do_remove+0x463
uvm_map_teardown(0) at uvm_map_teardown+0x143
uvmspace_free(8e0f9148) at uvmspace_free+0x36
uvm_exit(8e0f9148) at uvm_exit+0x16
reaper() at reaper+0x156
end trace frame: 0x0, count: 2
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{0}>
ddb{0}> show locks
exclusive mutex  r = 0 (0x81d1bcc0) locked @ /usr/src/sys-realclean/uvm/uvm_pmemrange.c:1124
ddb{0}> show witness
Sleep locks:
sysctllk (type: rwlock, depth: 0) -- last acquired @ /usr/src/sys-realclean/kern/kern_sysctl.c:233
 >lock (type: rwlock, depth: 2) -- last acquired @ /usr/src/sys-realclean/uvm/uvm_map.c:1936
 netlock (type: rwlock, depth: 1) -- last acquired @ /usr/src/sys-realclean/netinet/igmp.c:609
  pools (type: rwlock, depth: 2) -- last acquired @ /usr/src/sys-realclean/kern/subr_pool.c:474
  >ar_lock (type: rwlock, depth: 2) -- last acquired @ /usr/src/sys-realclean/net/rtable.c:500
swplk (type: rwlock, depth: 0) -- last acquired @ /usr/src/sys-realclean/uvm/uvm_swap.c:615
 >i_lock (type: rrwlock, depth: 1) -- last acquired @ /usr/src/sys-realclean/ufs/ufs/ufs_vnops.c:1559
  >lock (type: rwlock, 

Re: apmd(8): poll timer miscalculation

2018-06-07 Thread Scott Cheloha
On Thu, Jun 07, 2018 at 11:27:34AM +, sunil+b...@nimmagadda.net wrote:
> >Synopsis:apmd(8) poll timer off by 10x
> >Category:system
> >Environment:
>   System  : OpenBSD 6.3
>   Details : OpenBSD 6.3-current (GENERIC.MP) #54: Wed May 30 23:03:50 
> MDT 2018
>
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
>   Architecture: OpenBSD.amd64
>   Machine : amd64
> >Description:
> With apmd_flags="-Az10", expected apmd(8) to suspend when
> battery is at 10%, however, it didn't check in time and
> laptop ran out of power.
> >How-To-Repeat:
> Disconnect A/C adapter and run with -z percent greater
> than current estimated battery life reported by apm(8);
> poll every minute, for example...
>   # rcctl stop apmd
>   # apmd -A -z90 -t60
> should suspend in a minute, however, it suspends after 10
> minutes.
> >Fix:
>   The following diff...
> 
> 1. Provides a dedicated timer that fires every 10 seconds
> instead of relying on EVFILT_READ frequency.
> 
> 2. Increments a counter and checks against timeout value,
> if it exceeds, invokes auto-action.
> 
> 3. Wraps a few long lines that exceed 80 cols upon code
> shuffle.

I don't think we need to introduce an additional timer, or do so much
code shufflin' here, to fix your issue.  The problem seems to be that
apmtimeout is incremented once per iteration but must meet or exceed
ts.tv_sec to trigger a status check, so the period for battery status
checks is at least n^2 seconds.

I'm pretty sure this was unintentional, but it makes it unlikely
that apmd will catch a low battery percentage and suspend the machine
before the battery is totally exhausted, especially since n = 600 by
default.
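
To make the arithmetic concrete, here is a small, hypothetical C model of
the loop before and after the diff.  It is only a sketch: it assumes that no
descriptor events arrive, so every kevent() call blocks for the full timeout,
and the names old_check_period() and new_check_period() are invented for
illustration and do not exist in apmd.c.

```c
#include <assert.h>

/*
 * Hypothetical model, not code from apmd.c.  Assumption: no descriptor
 * events arrive, so each kevent() call blocks for the full `timeout`.
 */

/* Seconds until the first status check with the pre-diff logic. */
long
old_check_period(long timeout)
{
	long apmtimeout = 0;	/* iteration counter, as in the old loop */
	long elapsed = 0;	/* simulated wall-clock seconds */

	assert(timeout > 0);	/* apmd rejects -t 0 via usage() */
	for (;;) {
		apmtimeout += 1;
		elapsed += timeout;	/* kevent() timed out */
		if (apmtimeout >= timeout)
			return elapsed;	/* status is finally checked */
	}
}

/* Seconds until the first status check when testing rv == 0 instead. */
long
new_check_period(long timeout)
{
	assert(timeout > 0);
	return timeout;		/* the first kevent() timeout triggers it */
}
```

Under that assumption, old_check_period(600) comes to 360000 seconds (100
hours) between battery checks, versus 600 seconds for new_check_period(600).
Events that wake kevent() early change the length of individual iterations,
so delays observed in practice may differ from this figure.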

Here's a minimal diff that checks if we timed out on return from kevent.
There's additional cleanup that this change implies, but I've left it out
for the moment.

Of note is that an event for either of the descriptors resets the
timeout, regardless of how long it's been since we checked the battery
status: this is effectively the current behavior.  If people want, we
can add logic to decrement the maximum timeout accordingly on each
iteration and reset it when kevent truly times out.  This sounds closer
to what the manpage advertises for the -t option.
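
One possible reading of that follow-up idea is sketched below.  This is an
assumption about how it could look, not code from apmd.c: the helper
remaining_timeout() is hypothetical, and it expects all three arguments to be
seconds taken from the same monotonic clock.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper, not part of apmd.c: given the polling period, the
 * time of the last status check, and the current time (all in seconds
 * from the same monotonic clock), return how long kevent() may block.
 */
struct timespec
remaining_timeout(time_t period, time_t last_check, time_t now)
{
	struct timespec ts = { 0, 0 };
	time_t elapsed = now - last_check;

	assert(period > 0 && now >= last_check);
	if (elapsed < period)
		ts.tv_sec = period - elapsed;	/* time left in this period */
	return ts;	/* zero means the check is already due */
}
```

The caller would pass the result to kevent() on each iteration and update
last_check only when kevent() returns 0.  A zero timespec makes kevent()
return immediately, so an overdue check is not delayed by descriptor events.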

Caveat: I'm unfamiliar with apmd(8) and I don't have time just this
second to scour the change log to figure out why the behavior is what it
is now.  Someone more familiar with the code will need to corroborate
what I've said and the attached diff.

That said, feel free to try this diff in the meantime.  Does this work
for you?

Anyone more familiar with apmd(8) wanna chime in here?

--
Scott Cheloha

Index: usr.sbin/apmd/apmd.c
===
RCS file: /cvs/src/usr.sbin/apmd/apmd.c,v
retrieving revision 1.81
diff -u -p -r1.81 apmd.c
--- usr.sbin/apmd/apmd.c15 Oct 2017 15:14:49 -  1.81
+++ usr.sbin/apmd/apmd.c7 Jun 2018 20:26:11 -
@@ -488,7 +488,7 @@ main(int argc, char *argv[])
if ((rv = kevent(kq, NULL, 0, ev, 1, &sts)) < 0)
break;
 
-   if (apmtimeout >= ts.tv_sec) {
+   if (rv == 0) {
apmtimeout = 0;
 
/* wakeup for timeout: take status */



Re: booting after hibernate = reboot

2018-06-07 Thread Solene Rapenne


Philip Guenther writes:

> On Thu, 7 Jun 2018, Solene Rapenne wrote:
>> >Synopsis:   booting after hibernating loads and reboot
> ...
>>  Details : OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun  5 19:22:09 
>> MDT 2018
>>   
>> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> ...
>>  When using ZZZ to hibernate, I can see that it writes to disk as
>>  usual, then the computer shuts down. Everything is alright.  At
>>  boot, it uses /bsd.booted, the console displays the usual loading
>>  screen, and when it comes to the line looking like
>> "unhibernate @block" (it displays too fast), then the screen
>>  goes black and after a few seconds, the computer reboots.
>
> Can you find and report the "OpenBSD 6.etc" log line in /var/log/messages* 
> from the previous kernel where hibernate + resume worked correctly?  
> That'll narrow down when the regression occurred.
>
> Depending on how long a range of time+builds that is, there are various 
> strategies for identifying the source of the failure...
>
>
> Philip Guenther

The previous kernel was OpenBSD 6.3-current (GENERIC.MP) #43: Mon May 21 16:30:33 MDT 2018

However, I found this in /var/log/messages after each reboot that followed an unhibernate:

Jun  7 17:21:51 t400 syslogd[90658]: start
Jun  7 17:21:51 t400 /bsd: OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun  5 
19:22:09 MDT 2018
Jun  7 17:21:51 t400 /bsd: 
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
Jun  7 17:21:51 t400 /bsd: real mem = 4168814592 (3975MB)
Jun  7 17:21:51 t400 /bsd: avail mem = 3995996160 (3810MB)
Jun  7 17:21:51 t400 /bsd: mpath0 at root
Jun  7 17:21:51 t400 /bsd: scsibus0 at mpath0: 256 targets
Jun  7 17:21:51 t400 /bsd: mainbus0 at root
Jun  7 17:21:51 t400 /bsd: bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xe0010 (74 
entries)
Jun  7 17:21:51 t400 /bsd: bios0: vendor LENOVO version "7UET66WW (2.16 )" date 
04/22/2009
Jun  7 17:21:51 t400 /bsd: bios0: LENOVO 2768V8S
Jun  7 17:21:51 t400 /bsd: acpi0 at bios0: rev 2
Jun  7 17:21:51 t400 /bsd: acpi0: sleep states S0 S3 S4 S5
Jun  7 17:21:51 t400 /bsd: acpi0: tables DSDT FACP SSDT ECDT APIC MCFG HPET 
BOOT ASF! SSDT TCPA SSDT SSDT SSDT
Jun  7 17:21:51 t400 /bsd: acpi0: wakeup devices LID_(S3) SLPB(S3) IGBE(S4) 
EXP0(S4) EXP1(S4) EXP2(S4) EXP3(S4) EXP4(S4) PCI1(S4) USB0(S3) USB3(S3) 
USB5(S3) EHC0(S3) EHC1(S3) HDEF(S4)
Jun  7 17:21:51 t400 /bsd: acpitimer0 at acpi0: 3579545 Hz, 24 bits
Jun  7 17:21:51 t400 /bsd: acpiec0 at acpi0
Jun  7 17:21:51 t400 /bsd: acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
Jun  7 17:21:51 t400 /bsd: cpu0 at mainbus0: apid 0 (boot processor)
Jun  7 17:21:51 t400 /bsd: cpu0: Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz, 
2527.50 MHz
Jun  7 17:21:51 t400 /bsd: cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
Jun  7 17:21:51 t400 /bsd: cpu0: 6MB 64b/line 16-way L2 cache
Jun  7 17:21:51 t400 /bsd: cpu0: smt 0, core 0, package 0
Jun  7 17:21:51 t400 /bsd: mtrr: Pentium Pro MTRR support, 7 var ranges, 88 
fixed ranges
Jun  7 17:21:51 t400 /bsd: using xsave
Jun  7 17:21:51 t400 /bsd: cpu0: apic clock running at 265MHz
Jun  7 17:21:51 t400 /bsd: cpu0: mwait min=64, max=64, 
C-substates=0.2.2.2.2.1.3, IBE
Jun  7 17:21:51 t400 /bsd: cpu1 at mainbus0: apid 1 (application processor)
Jun  7 17:21:51 t400 sendsyslog: dropped 1 message, error 57, pid 23145
Jun  7 17:21:51 t400 unbound: [38185:0] notice: init module 0: validator
Jun  7 17:21:51 t400 unbound: [38185:0] notice: init module 1: iterator
Jun  7 17:21:52 t400 savecore: no core dump
Jun  7 17:21:54 t400 apmd: battery status: high. external power status: 
connected. estimated battery life 100%
Jun  7 17:21:59 t400 /bsd: lock order reversal:
Jun  7 17:21:59 t400 /bsd:  1st 0xff011adb9460 vmmaplk (>lock) @ 
/usr/src/sys/uvm/uvm_fault.c:1441
Jun  7 17:21:59 t400 /bsd:  2nd 0x80104138 drmdevlk 
(>struct_mutex) @ /usr/src/sys/dev/pci/drm/i915/i915_gem.c:1801
Jun  7 17:21:59 t400 /bsd: lock order ">struct_mutex"(rwlock) -> 
">lock"(rwlock) first seen at:
Jun  7 17:21:59 t400 /bsd: #0  witness_checkorder+0x4b4
Jun  7 17:21:59 t400 /bsd: #1  _rw_enter+0x68
Jun  7 17:21:59 t400 /bsd: #2  vm_map_lock_ln+0xbc
Jun  7 17:21:59 t400 /bsd: #3  uvm_map+0x1a1
Jun  7 17:21:59 t400 /bsd: #4  km_alloc+0x16a
Jun  7 17:21:59 t400 /bsd: #5  bus_space_map+0x159
Jun  7 17:21:59 t400 /bsd: #6  i965_alloc_ifp+0xc1
Jun  7 17:21:59 t400 /bsd: #7  intel_gtt_chipset_setup+0x1b1
Jun  7 17:21:59 t400 /bsd: #8  intel_enable_gtt+0x26
Jun  7 17:21:59 t400 /bsd: #9  i915_gem_init_hw+0x43
Jun  7 17:21:59 t400 /bsd: #10 i915_gem_init+0x24e
Jun  7 17:21:59 t400 /bsd: #11 i915_driver_load+0xfc1
Jun  7 17:21:59 t400 /bsd: #12 inteldrm_attach+0x37f
Jun  7 17:21:59 t400 /bsd: #13 config_attach+0x20e
Jun  7 17:21:59 t400 /bsd: #14 

Re: booting after hibernate = reboot

2018-06-07 Thread Theo Buehler
On Thu, Jun 07, 2018 at 11:23:20AM -0700, Philip Guenther wrote:
> On Thu, 7 Jun 2018, Solene Rapenne wrote:
> > >Synopsis:  booting after hibernating loads and reboot
> ...
> > Details : OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun  5 19:22:09 
> > MDT 2018
> >  
> > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> ...
> > When using ZZZ to hibernate, I can see that it writes to disk as
> > usual, then the computer shuts down. Everything is alright.  At
> > boot, it uses /bsd.booted, the console displays the usual loading
> > screen, and when it comes to the line looking like
> > "unhibernate @block" (it displays too fast), then the screen
> > goes black and after a few seconds, the computer reboots.
> 
> Can you find and report the "OpenBSD 6.etc" log line in /var/log/messages* 
> from the previous kernel where hibernate + resume worked correctly?  
> That'll narrow down when the regression occurred.
> 
> Depending on how long a range of time+builds that is, there are various 
> strategies for identifying the source of the failure...

Perhaps Solene can do better than this, but I tested jsing's softraid
unhibernate diff on June 2nd around noon UTC with a freshly updated cvs
tree. At that point hibernate definitely worked.



Re: booting after hibernate = reboot

2018-06-07 Thread Philip Guenther
On Thu, 7 Jun 2018, Solene Rapenne wrote:
> >Synopsis:booting after hibernating loads and reboot
...
>   Details : OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun  5 19:22:09 
> MDT 2018
>
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
...
>   When using ZZZ to hibernate, I can see that it writes to disk as
>   usual, then the computer shuts down. Everything is alright.  At
>   boot, it uses /bsd.booted, the console displays the usual loading
>   screen, and when it comes to the line looking like
> "unhibernate @block" (it displays too fast), then the screen
>   goes black and after a few seconds, the computer reboots.

Can you find and report the "OpenBSD 6.etc" log line in /var/log/messages* 
from the previous kernel where hibernate + resume worked correctly?  
That'll narrow down when the regression occurred.

Depending on how long a range of time+builds that is, there are various 
strategies for identifying the source of the failure...


Philip Guenther



booting after hibernate = reboot

2018-06-07 Thread Solene Rapenne


>Synopsis:  booting after hibernating loads and reboot
>Category:  kernel
>Environment:
System  : OpenBSD 6.3
Details : OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun  5 19:22:09 
MDT 2018
 
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine : amd64
>Description:
When using ZZZ to hibernate, I can see that it writes to disk as
usual, then the computer shuts down. Everything is alright.  At
boot, it uses /bsd.booted, the console displays the usual loading
screen, and when it comes to the line looking like
"unhibernate @block" (it displays too fast), then the screen
goes black and after a few seconds, the computer reboots.

After that reboot, the usual /bsd is used as a default, and FWIW
there is no need to fsck the drives.

>How-To-Repeat:
ZZZ
power on the computer and wait until it reboots

>Fix:


dmesg:
OpenBSD 6.3-current (GENERIC.MP) #84: Tue Jun  5 19:22:09 MDT 2018
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4168814592 (3975MB)
avail mem = 3996000256 (3810MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xe0010 (74 entries)
bios0: vendor LENOVO version "7UET66WW (2.16 )" date 04/22/2009
bios0: LENOVO 2768V8S
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SSDT ECDT APIC MCFG HPET BOOT ASF! SSDT TCPA SSDT SSDT 
SSDT
acpi0: wakeup devices LID_(S3) SLPB(S3) IGBE(S4) EXP0(S4) EXP1(S4) EXP2(S4) 
EXP3(S4) EXP4(S4) PCI1(S4) USB0(S3) USB3(S3) USB5(S3) EHC0(S3) EHC1(S3) HDEF(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpiec0 at acpi0
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz, 2527.45 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu0: 6MB 64b/line 16-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 7 var ranges, 88 fixed ranges
using xsave
cpu0: apic clock running at 266MHz
cpu0: mwait min=64, max=64, C-substates=0.2.2.2.2.1.3, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz, 2527.01 MHz
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,XSAVE,NXE,LONG,LAHF,PERF,SENSOR,MELTDOWN
cpu1: 6MB 64b/line 16-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins
, remapped to apid 1
acpimcfg0 at acpi0 addr 0xe000, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (AGP_)
acpiprt2 at acpi0: bus 2 (EXP0)
acpiprt3 at acpi0: bus 3 (EXP1)
acpiprt4 at acpi0: bus -1 (EXP2)
acpiprt5 at acpi0: bus -1 (EXP3)
acpiprt6 at acpi0: bus -1 (EXP4)
acpiprt7 at acpi0: bus 21 (PCI1)
acpicpu0 at acpi0: !C2(500@1 mwait.1@0x10), C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: !C2(500@1 mwait.1@0x10), C1(1000@1 mwait.1), PSS
acpipwrres0 at acpi0: PUBS, resource for USB0, USB3, USB5, EHC0, EHC1
acpitz0 at acpi0: critical temperature is 127 degC
acpitz1 at acpi0: critical temperature is 100 degC
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: SLPB
acpicmos0 at acpi0
"IBM0057" at acpi0 not configured
"INTC0102" at acpi0 not configured
acpibat0 at acpi0: BAT0 model "42T4645" serial   793 type LION oem "Panasonic"
acpiac0 at acpi0: AC unit online
acpithinkpad0 at acpi0
"PNP0C14" at acpi0 not configured
acpidock0 at acpi0: GDCK not docked (0)
acpivideo0 at acpi0: VID_
acpivout0 at acpivideo0: LCD0
acpivideo1 at acpi0: VID_
cpu0: Enhanced SpeedStep 2527 MHz: speeds: 2534, 2533, 1600, 800 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel GM45 Host" rev 0x07
inteldrm0 at pci0 dev 2 function 0 "Intel GM45 Video" rev 0x07
drm0 at inteldrm0
intagp0 at inteldrm0
agp0 at intagp0: aperture at 0xd000, size 0x1000
inteldrm0: msi
inteldrm0: 1440x900, 32bpp
wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation)
wsdisplay0: screen 1-5 added (std, vt100 emulation)
"Intel GM45 Video" rev 0x07 at pci0 dev 2 function 1 not configured
"Intel GM45 HECI" rev 0x07 at pci0 dev 3 function 0 not configured
pciide0 at pci0 dev 3 function 2 "Intel GM45 PT IDER" rev 0x07: DMA 
(unsupported), channel 0 wired to native-PCI, channel 1 wired to native-PCI
pciide0: using apic 1 int 18 for native-PCI interrupt
pciide0: channel 0 ignored (not responding; disabled or no drives?)
pciide0: channel 1 ignored (not responding; disabled or no drives?)
puc0 at pci0 dev 3 function 3 "Intel GM45 KT" rev 0x07: ports: 16 com
com4 at 

apmd(8): poll timer miscalculation

2018-06-07 Thread sunil+bugs
>Synopsis:  apmd(8) poll timer off by 10x
>Category:  system
>Environment:
System  : OpenBSD 6.3
Details : OpenBSD 6.3-current (GENERIC.MP) #54: Wed May 30 23:03:50 
MDT 2018
 
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine : amd64
>Description:
With apmd_flags="-Az10", expected apmd(8) to suspend when
battery is at 10%, however, it didn't check in time and
laptop ran out of power.
>How-To-Repeat:
Disconnect A/C adapter and run with -z percent greater
than current estimated battery life reported by apm(8);
poll every minute, for example...
# rcctl stop apmd
# apmd -A -z90 -t60
should suspend in a minute, however, it suspends after 10
minutes.
>Fix:
The following diff...

1. Provides a dedicated timer that fires every 10 seconds
instead of relying on EVFILT_READ frequency.

2. Increments a counter and checks against timeout value,
if it exceeds, invokes auto-action.

3. Wraps a few long lines that exceed 80 cols upon code
shuffle.

Index: apmd.c
===
RCS file: /cvs/src/usr.sbin/apmd/apmd.c,v
retrieving revision 1.81
diff -u -p -w -r1.81 apmd.c
--- apmd.c  15 Oct 2017 15:14:49 -  1.81
+++ apmd.c  7 Jun 2018 10:56:06 -
@@ -56,6 +56,9 @@
 #define AUTO_SUSPEND 1
 #define AUTO_HIBERNATE 2
 
+#define TIMO (10*60)   /* 10 minutes */
+#define TIMER_ID 1
+
 const char apmdev[] = _PATH_APM_CTLDEV;
 const char sockfile[] = _PATH_APM_SOCKET;
 
@@ -341,8 +344,6 @@ hibernate(int ctl_fd)
ioctl(ctl_fd, APM_IOC_HIBERNATE, 0);
 }
 
-#define TIMO (10*60)   /* 10 minutes */
-
 int
 main(int argc, char *argv[])
 {
@@ -353,13 +354,12 @@ main(int argc, char *argv[])
int statonly = 0;
int powerstatus = 0, powerbak = 0, powerchange = 0;
int noacsleep = 0;
-   struct timespec ts = {TIMO, 0}, sts = {0, 0};
struct apm_power_info pinfo;
-   time_t apmtimeout = 0;
+   time_t apmtimeout = TIMO, counter = 0;
const char *sockname = sockfile;
const char *errstr;
int kq, nchanges;
-   struct kevent ev[2];
+   struct kevent ev[3];
int ncpu_mib[2] = { CTL_HW, HW_NCPU };
int ncpu;
size_t ncpu_sz = sizeof(ncpu);
@@ -379,8 +379,8 @@ main(int argc, char *argv[])
sockname = optarg;
break;
case 't':
-   ts.tv_sec = strtoul(optarg, NULL, 0);
-   if (ts.tv_sec == 0)
+   apmtimeout = strtoul(optarg, NULL, 0);
+   if (apmtimeout == 0)
usage();
break;
case 's':   /* status only */
@@ -466,14 +466,15 @@ main(int argc, char *argv[])
 
EV_SET(&ev[0], sock_fd, EVFILT_READ, EV_ADD | EV_ENABLE | EV_CLEAR,
0, 0, NULL);
+   EV_SET(&ev[1], TIMER_ID, EVFILT_TIMER, EV_ADD, 0, 1, NULL);
if (ctl_fd == -1)
-   nchanges = 1;
+   nchanges = 2;
else {
-   EV_SET(&ev[1], ctl_fd, EVFILT_READ, EV_ADD | EV_ENABLE |
+   EV_SET(&ev[2], ctl_fd, EVFILT_READ, EV_ADD | EV_ENABLE |
EV_CLEAR, 0, 0, NULL);
-   nchanges = 2;
+   nchanges = 3;
}
-   if (kevent(kq, ev, nchanges, NULL, 0, &sts) < 0)
+   if (kevent(kq, ev, nchanges, NULL, 0, NULL) < 0)
error("kevent", NULL);
 
if (sysctl(ncpu_mib, 2, &ncpu, &ncpu_sz, NULL, 0) < 0)
@@ -482,41 +483,14 @@ main(int argc, char *argv[])
for (;;) {
int rv;
 
-   sts = ts;
-
-   apmtimeout += 1;
-   if ((rv = kevent(kq, NULL, 0, ev, 1, &sts)) < 0)
+   if ((rv = kevent(kq, NULL, 0, ev, 1, NULL)) < 0)
break;
 
-   if (apmtimeout >= ts.tv_sec) {
-   apmtimeout = 0;
-
-   /* wakeup for timeout: take status */
-   powerbak = power_status(ctl_fd, 0, &pinfo);
-   if (powerstatus != powerbak) {
-   powerstatus = powerbak;
-   powerchange = 1;
-   }
-
-   if (!powerstatus && autoaction &&
-   autolimit > (int)pinfo.battery_life) {
-   syslog(LOG_NOTICE,
-   "estimated battery life %d%%, "
-   "autoaction limit set to %d%% .",
-   pinfo.battery_life,
-   autolimit
-   );
-
-   if (autoaction == AUTO_SUSPEND)