Re: machines running GNOME dying

2020-11-05 Thread Mikolaj Kucharski
I've started experiencing this recently as well, on X11 with cwm (no GNOME)
and Chromium running. Everything freezes, except that the caps lock LED still
works and I can type `boot reboot` or `call cpu_reset` and the system reboots.

Today's hang was (I think) on:

OpenBSD 6.8-current (GENERIC.MP) #152: Thu Oct 29 15:48:34 MDT 2020
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

2020-10-30T14:03:48.738Z mbx-0013 /bsd: WARNING: / was not properly unmounted
2020-10-30T14:22:50.934Z mbx-0013 /bsd: WARNING: / was not properly unmounted
2020-11-04T14:35:17.126Z mbx-0013 /bsd: WARNING: / was not properly unmounted
2020-11-05T13:50:58.049Z mbx-0013 /bsd: WARNING: / was not properly unmounted

The issue has happened, I think, 4 times so far, around the times above,
judging by the reboots just after each hang.
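
For anyone wanting to pull the same count out of their own logs, here is a
minimal sketch in Python. It assumes the syslog line format shown above (ISO
timestamp, hostname, then `/bsd:`); the function name unclean_boots and the
SAMPLE data are only for illustration, and other syslog configurations will
need a different pattern.

```python
import re
from datetime import datetime

# Assumed line format, taken from the log excerpt above:
# 2020-10-30T14:03:48.738Z mbx-0013 /bsd: WARNING: / was not properly unmounted
PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+)Z\s+\S+\s+/bsd: "
    r"WARNING: (?P<fs>\S+) was not properly unmounted"
)


def unclean_boots(lines):
    """Return (timestamp, filesystem) pairs for each unclean-unmount warning."""
    found = []
    for line in lines:
        m = PATTERN.match(line)
        if m:
            when = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S.%f")
            found.append((when, m.group("fs")))
    return found


# Sample input: the four warnings quoted above.
SAMPLE = """\
2020-10-30T14:03:48.738Z mbx-0013 /bsd: WARNING: / was not properly unmounted
2020-10-30T14:22:50.934Z mbx-0013 /bsd: WARNING: / was not properly unmounted
2020-11-04T14:35:17.126Z mbx-0013 /bsd: WARNING: / was not properly unmounted
2020-11-05T13:50:58.049Z mbx-0013 /bsd: WARNING: / was not properly unmounted
"""

boots = unclean_boots(SAMPLE.splitlines())
for when, fs in boots:
    print(when.isoformat(), fs)
print(len(boots), "boots after an unclean shutdown")
```

Each warning is logged on the boot that follows a hang, so the timestamps only
bound when the hang happened; they do not record the hang itself.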

When the freeze happens I see only the frozen X11 output. There is no panic
message, and if I type bt, show panic or boot reboot (blindly, in ddb I
think) I see no output from those commands, so I cannot provide anything
useful.

Any tips on how I can capture bt, show panic or any other info?

The most annoying part of this hang is that it affects Chromium in a way
that I lose the last session information. The browser asks to restore the
session, but my tabs are gone: it starts with no tabs, or only a few are
restored.

My current dmesg is:

OpenBSD 6.8-current (GENERIC.MP) #161: Wed Nov  4 10:14:02 MST 2020
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 8302006272 (7917MB)
avail mem = 8035102720 (7662MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0x7f114000 (64 entries)
bios0: vendor HUAWEI version "2.00" date 11/07/2017
bios0: HUAWEI HUAWEI MateBook X
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP UEFI UEFI ECDT SSDT MSDM SSDT SSDT SSDT ASPT BOOT HPET 
APIC MCFG SSDT WSMT SSDT DBGP DBG2 SSDT SSDT DMAR NHLT FPDT BGRT
acpi0: wakeup devices GLAN(S4) XHC_(S3) XDCI(S4) HDAS(S4) RP01(S4) PXSX(S4) 
RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) PXSX(S4) 
RP06(S4) PXSX(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpiec0 at acpi0
acpihpet0 at acpi0: 2399 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz, 2595.04 MHz, 06-8e-09
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 24MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz, 2592.98 MHz, 06-8e-09
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 1 (application processor)
cpu2: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz, 2593.96 MHz, 06-8e-09
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 1, core 0, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz, 2593.96 MHz, 06-8e-09
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,T

Re: machines running GNOME dying

2020-10-23 Thread Stuart Henderson
... I should have also said; I don't know for sure that it's specific
to GNOME, that's just what the machines happen to be running.

Test machine arrived, now I have to find one of the silly video
cables with the right type of HDMI connector for them...


On 2020/10/23 09:44, Stuart Henderson wrote:
> I have some NUCs running GNOME/gdm (following the pkg-readme
> instructions). Typically the only application they're running is
> Chromium. The people using them have reported them hanging at times.
> For the last one, I got a report before they rebooted and had a few
> things tried: the numlock light responds, and the machine came back after
> blind typing "boot r", so it seems likely that they are ending up in DDB.
> 
> Unsure if it's related or not but there are some warnings in
> .cache/gdm/session.log. Some seem "normal" i.e.  there are plenty
> of them including at times when no problems are reported
> 
> WARNING: Kernel has no file descriptor comparison support: Resource 
> temporarily unavailable
> WARNING: Kernel has no file descriptor comparison support: Resource 
> temporarily unavailable
> 
> There's also something from gsd-media-keys which has often
> (perhaps always) been seen in logs before hangs; this in session.log
> 
> (gsd-media-keys:15422): media-keys-plugin-WARNING **: 12:29:11.705: Unable to 
> get default sink
> [8310:834619968:1022/123242.016898:ERROR:mdns_responder.cc(885)] The mDNS 
> responder manager is not started yet.
> [8310:834619968:1022/124616.862337:ERROR:mdns_responder.cc(885)] The mDNS 
> responder manager is not started yet.
> WARNING: Kernel has no file descriptor comparison support: Resource 
> temporarily unavailable
> WARNING: Kernel has no file descriptor comparison support: Resource 
> temporarily unavailable
> 
> and some vfprintf NULL in syslog:
> 
> Oct 21 14:00:02 syslogd[14226]: restart
> Oct 21 18:08:47 /bsd: drm:pid81340:intel_pipe_update_start *ERROR* [drm] 
> *ERROR* Potential atomic update failure on pipe A
> Oct 21 18:08:54 /bsd: drm:pid81340:intel_pipe_update_start *ERROR* [drm] 
> *ERROR* Potential atomic update failure on pipe A
> Oct 22 12:28:31 /bsd: drm:pid81340:intel_pipe_update_start *ERROR* [drm] 
> *ERROR* Potential atomic update failure on pipe A
> Oct 22 12:29:11 pulseaudio[32185]: [sndio] module-sndio.c: POLLHUP!
> Oct 22 12:29:11 gsd-media-keys: vfprintf %s NULL in 
> "gvc-mixer-output-set-property - card port name: %s"
> Oct 22 12:29:12 gnome-shell: vfprintf %s NULL in 
> "gvc-mixer-output-set-property - card port name: %s"
> 
> Obviously the null vfprintf messages aren't the direct cause of a machine
> crash, but they are very often the last syslog entry before the reboot of
> a machine that has crashed. It might be a red herring but could also be a
> clue.
> 
> Running 6.8 now but they've seen similar hangs since at least 6.7
> (they've run various snaps in between), maybe earlier.
> 
> I'm getting one of the machines sent to me to see if I can find a way to
> reproduce it (not sure if I'll be able to get anything more from it even
> if I do; there's no serial console and I don't think they have AMT)
> but wondered if this rings any bells or if anyone has ideas.
> 
> 
> OpenBSD 6.8 (GENERIC.MP) #98: Sun Oct  4 18:13:26 MDT 2020
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 4169572352 (3976MB)
> avail mem = 4028157952 (3841MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xec240 (83 entries)
> bios0: vendor Intel Corp. version "WYLPT10H.86A.0047.2018.0303.1725" date 
> 03/03/2018
> bios0: Intel Corporation D34010WYK
> acpi0 at bios0: ACPI 5.0
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP APIC FPDT FIDT SSDT SSDT MCFG HPET SSDT SSDT DMAR CSRT
> acpi0: wakeup devices RP01(S4) PXSX(S4) PXSX(S4) PXSX(S4) RP04(S4) PXSX(S4) 
> PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) GLAN(S4) EHC1(S4) EHC2(S4) XHC_(S4) 
> HDEF(S4) PEG0(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz, 1696.40 MHz, 06-45-01
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 99MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz, 1696.09 MHz, 06-45-01
> cpu1: 
> FPU,