[unveil] kernel panic for mfs/read-only filesystems during reboot
I got a kernel panic on OpenBSD 6.4-beta (GENERIC.MP) #260: Sat Aug 25 02:10:42 MDT 2018
when experimenting with mfs and read-only filesystems.

# shutdown -r now
shutdown: unveil: Read-only file system

# mount
/dev/sd0a on / type ffs (local, read-only)
mfs:35132 on /dev type mfs (asynchronous, local, noexec, nosuid, size=102400 512-blocks)
mfs:74539 on /etc type mfs (asynchronous, local, nodev, nosuid, size=102400 512-blocks)
mfs:26844 on /tmp type mfs (asynchronous, local, nodev, noexec, nosuid, size=204800 512-blocks)
mfs:54540 on /var type mfs (asynchronous, local, nodev, noexec, nosuid, size=102400 512-blocks)
mfs:55695 on /var/log type mfs (asynchronous, local, nodev, noexec, nosuid, size=262144 512-blocks)
/dev/sd0d on /usr type ffs (local, nodev, read-only)
/dev/sd0e on /usr/local type ffs (local, nodev, nosuid, read-only)
/dev/sd0f on /data type ffs (local, nodev, nosuid, read-only)

# reboot
panic: kernel diagnostic assertion "vp->v_uvcount == 0" failed: file "/usr/src/sys/kern/kern_unveil.c", line 748
Stopped at      db_enter+0x12:  popq    %r11
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
*469083  35132      0           0          0   1K  mount_mfs
 240741  31688      0     0x14000     0x2000       reaper
db_enter() at db_enter+0x12
panic() at panic+0x120
__assert(816afd14,800021147ec0,0,ff006fc21338) at __assert+0x24
unveil_removevnode(b3fdf0663036209b) at unveil_removevnode+0xf2
dounmount_leaf(36a320cd3cded2ab,80096c00,0) at dounmount_leaf+0x69
dounmount(bf305e5e97ceba2d,80096c00,80008008) at dounmount+0xfa
mfs_start(3167c81713baac45,80096c00,ff007f63e000) at mfs_start+0xf9
sys_mount(a36477f38b506169,150,80008008) at sys_mount+0x5b5
syscall(bc035fa0eb20afa4) at syscall+0x32a
Xsyscall(6,15,7f7da610,15,7f7daaac,7f7daf8d) at Xsyscall+0x128
end of kernel
end trace frame: 0x7f7dae00, count: 5
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{1}> show panic
kernel diagnostic assertion "vp->v_uvcount == 0" failed: file "/usr/src/sys/kern/kern_unveil.c", line 748

ddb{1}> show uvm
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
  504236 VM pages: 10856 active, 2913 inactive, 0 wired, 460030 free (57336 zero)
  min 10% (25) anon, 10% (25) vnode, 5% (12) vtext
  freemin=16807, free-target=22409, inactive-target=0, wired-max=168078
  faults=131805, traps=130457, intrs=12942, ctxswitch=38035 fpuswitch=0
  softint=15328, syscalls=340659, kmapent=16
  fault counts:
    noram=0, noanon=0, noamap=0, pgwait=0, pgrele=0
    ok relocks(total)=7550(7550), anget(retries)=49584(0), amapcopy=56876
    neighbor anon/obj pg=3853/52486, gets(lock/unlock)=24354/7550
    cases: anon=40560, anoncow=9024, obj=20857, prcopy=3497, przero=57861
  daemon and swap counts:
    woke=0, revs=0, scans=0, obscans=0, anscans=0
    busy=0, freed=0, reactivate=0, deactivate=0
    pageouts=0, pending=0, nswget=0
    nswapdev=1
    swpages=526127, swpginuse=0, swpgonly=0 paging=0
  kernel pointers:
    objs(kern)=0x81ca11a0

ddb{1}> show bcstats
Current Buffer Cache status:
numbufs 5869 busymapped 0, delwri 12
kvaslots 6302 avail kva slots 6302
bufpages 22694, dmapages 22694, dirtypages 21
pendingreads 0, pendingwrites 0
highflips 0, highflops 0, dmaflips 0

ddb{1}> trace
db_enter() at db_enter+0x12
panic() at panic+0x120
__assert(816afd14,800021147ec0,0,ff006fc21338) at __assert+0x24
unveil_removevnode(b3fdf0663036209b) at unveil_removevnode+0xf2
dounmount_leaf(36a320cd3cded2ab,80096c00,0) at dounmount_leaf+0x69
dounmount(bf305e5e97ceba2d,80096c00,80008008) at dounmount+0xfa
mfs_start(3167c81713baac45,80096c00,ff007f63e000) at mfs_start+0xf9
sys_mount(a36477f38b506169,150,80008008) at sys_mount+0x5b5
syscall(bc035fa0eb20afa4) at syscall+0x32a
Xsyscall(6,15,7f7da610,15,7f7daaac,7f7daf8d) at Xsyscall+0x128
end of kernel
end trace frame: 0x7f7dae00, count: -10

- steps to reproduce:
  * create fs setup as in mount output below
  * shutdown -r
  * reboot

# mount
/dev/sd0a on / type ffs (local, read-only)
mfs:35132 on /dev type mfs (asynchronous, local, noexec, nosuid, size=102400 512-blocks)
mfs:74539 on /etc type mfs (asynchronous, local, nodev, nosuid, size=102400 512-blocks)
mfs:26844 on /tmp type mfs (asynchronous, local, nodev, noexec, nosuid, size=204800 512-blocks)
mfs:54540 on /var type mfs (asynchronous, local, nodev, noexec, nosuid, size=102400 512-blocks)
mfs:55695 on /var/log type mfs (asynchronous, local, nodev, noexec, nosuid, size=262144 512-blocks)
/dev/sd0d on /usr type ffs (local, nodev, read-only)
/dev/sd0e on /usr/local type ffs (local, nodev, nosuid, read-only)
/dev/sd0f on /data type
Re: axen Ethernet device errors on both USB3.0 and USB2.0 ports
On Fri, Aug 24, 2018 at 06:02:20AM +, sc.dy...@gmail.com wrote:
> On 2018/08/19 09:40, Stefan Sperling wrote:
> > On Sun, Aug 19, 2018 at 11:05:04AM +0200, Stefan Sperling wrote:
> >> On Sun, Aug 19, 2018 at 09:56:33AM +0200, Remi Locherer wrote:
> >>> It would help if you could send a clean version that applies to -current.
> >>
> >> One of the attachments was in fact clean but yes, this
> >> thread has been much too noisy to follow easily.
> >>
> >> Try this.
> >
> > Unfortunately, while this diff does indeed work on xhci(4), I've just
> > found that this diff breaks axen(4) attached to ehci(4) completely.
> >
> > I see several "axen0: rxeof: too short transfer" in dmesg and
> > almost all packets are lost. Even my Ethernet switch gives up
> > eventually and disables the port.
> >
> > So this diff is not ready to be committed.
>
> I didn't check if axen works on ehci.
> On my ehci (intel PCH) that bug is reproduced, and
> I found that it works on ehci with 16kB RX buffer.
> I preserve the original bufsz decision code.

I applied axen5.diff and xhci.diff and tested the resulting kernel on an
old Samsung notebook that has ehci and xhci (dmesg and usbdevs below).

When the axen dongle is attached via xhci it gets link but dhclient never
gets a lease.

This works when attached via ehci. But after some light traffic (browsing
with netsurf) the system panics. Here is the output from ddb (copied by hand):

kernel: page fault trap, code=0
Stopped at      memcpy+0x15:    repe movsq      (%rsi),%es:(%rdi)
ddb{1}> show panic
kernel page fault
uvm_fault(0xffdef19438, 0x0, 0, 1) -> e
memcpy(79e3..) at memcpy+0x15
end trace frame: 0x800032e06cd0, count: 0
ddb{1}> trace
memcpy(79e...) at memcpy+0x15
ptcread(5b11cd.) at ptcread+0x1eb
spec_read(70e.) at spec_read+0xab
VOP_READ(4b037..) at VOP_READ+0x49
vn_read(af8b.) at dofilereadv+0xe0
sys_read(9862) at sys_read+0x5c
syscall(822b.) at syscall+0x32a
Xsyscall(0,3,0,3,f,1954e...) at Xsyscall+0x128
end of kernel
end trace frame 0x7f7d3430, count: -9
ddb{1}> mach ddb 0
Stopped at      x86_ipi_db+0x12:        popq    %r11
ddb{0}> trace
x86_ipi_db(5d...) at x86_ipi_db+0x12
x86_ipi_handler() at x86_ipi_handler+0x80
Xresume_lapic_ipi(9,ff.) at Xresume_lapic_ipi+0x23
___mp_lock(58ifaff) at ___mp_lock+0x68
intr_handler(a26f9) at intr_handler+0x40
Xintr_ioapic_edge12_untramp(6,fff...) at Xintr_ioapic_edge12_untramp+0x19f
___mp_lock(58faff...) at ___mp_lock+0x68
intr_handler(a26f9) at intr_handler+0x40
Xintr_ioapic_edge25_untramp(0,3,..) at Xintr_ioapic_edge25_untramp+0x19f
acpicpu_idle() at acpicpu_idle+0x166
sched_idle(0) at sched_idle+0x245
end trace frame: 0x0, count: -11
ddb{0}>

This does not happen when running a snapshot kernel.

dmesg + usbdevs -vvv

OpenBSD 6.4-beta (GENERIC.MP) #0: Sat Aug 25 19:45:29 CEST 2018
    r...@530u.relo.ch:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 8462659584 (8070MB)
avail mem = 8196993024 (7817MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.6 @ 0xe0840 (63 entries)
bios0: vendor Phoenix Technologies Ltd. version "05XK" date 02/10/2012
bios0: SAMSUNG ELECTRONICS CO., LTD. 530U3BI/530U4BI/530U4BH
acpi0 at bios0: rev 2
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP SLIC SSDT ASF! HPET APIC MCFG SSDT SSDT UEFI UEFI UEFI
acpi0: wakeup devices P0P1(S4) GLAN(S4) HDEF(S4) PXSX(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) PXSX(S4) RP06(S4) PXSX(S4) RP07(S4) PXSX(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i5-2467M CPU @ 1.60GHz, 1597.58 MHz, 06-2a-07
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i5-2467M CPU @ 1.60GHz, 1895.69 MHz, 06-2a-07
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 14 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xf800, bus 0-63
Re: Plugging in ADB-enabled Android device makes kernel panic with xhci
On Thu, Aug 23, 2018 at 08:45:54PM +0900, Tom Murphy wrote:
> I've narrowed it down.
>
>    Last kernel where adb works: June 24 09:59:46 MDT 2018
>    1st Kernel where adb panics: June 25 13:10:32 MDT 2018
>
> I did notice when my phone's booted into LineageOS and I have ADB
> turned on, when I connect the phone via USB I get:
>
>    ugen1 at uhub0 port 7 "motorola XT1039" rev 2.00/2.28 addr 8
>
> However, I'm not able to actually connect to it with adb shell or
> anything else. It says: Error device offline or something.
>
> When I boot the phone into recovery mode, the phone shows up like
> this when I plug it in:
>
>    ugen1 at uhub0 port 7 "Motorola Moto G LTE" rev 2.00/2.28 addr 4
>
> (different name!) and I am able to use adb shell, adb push/pull, etc.
>
> I think there's some issue with LineageOS' ADB mode, but that's not
> really relevant here (it's a separate issue and outside of OpenBSD
> perhaps though I'll have to test with Linux or some other OS.)
>
> I'm going to look at the commits next.
>
> -Tom

I can verify that this commit is what makes the kernel panic when adb is run
and an Android device is connected to the machine with ADB enabled:

https://marc.info/?l=openbsd-cvs&m=152996258723362&w=2

CVSROOT:        /cvs
Module name:    src
Changes by:     v...@cvs.openbsd.org    2018/06/25 10:06:27

Modified files:
        sys/kern       : vfs_syscalls.c
        lib/libc/sys   : dup.2

Log message:
During open(2), release the fdp lock before calling vn_open(9).
This lets other threads of the process modify the file descriptor
table even if the vn_open(9) call blocks.

The change has an effect on dup2(2) and dup3(2). If the new descriptor
is the same as the one reserved by an unfinished open(2), the system
call will fail with error EBUSY. The accept(2) system call already
behaves like this.

Issue pointed out by art@ via mpi@
Tested in a bulk build by ajacoutot@
OK mpi@

* * *

I tested kernels compiled just before that commit and right after, and
that commit makes the kernel panic.

-Tom
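
For userland code affected by the dup2(2) change described in the log message
above, a minimal sketch in C of retrying on EBUSY might look like the
following. The helper name dup2_retry and its max_tries parameter are
hypothetical and not part of the commit; the sketch only assumes the behaviour
stated in the log message, namely that dup2(2) fails with EBUSY while the
target descriptor is reserved by an unfinished open(2). It is not a fix for
the panic reported in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the commit): call dup2(2) and retry a
 * bounded number of times while it fails with EBUSY or EINTR.
 * Returns newfd on success, or -1 with errno set by the last dup2(2).
 */
int
dup2_retry(int oldfd, int newfd, int max_tries)
{
	int i, r = -1;

	assert(max_tries > 0);
	for (i = 0; i < max_tries; i++) {
		r = dup2(oldfd, newfd);
		/* Stop on success or on any error that is not transient. */
		if (r != -1 || (errno != EBUSY && errno != EINTR))
			return r;
		/* Give the thread that is still inside open(2) a chance to run. */
		sched_yield();
	}
	return r;
}
```

A caller would use dup2_retry(fd, STDOUT_FILENO, 10) in place of a plain
dup2(2) call; the retry bound keeps the loop from spinning forever if the
competing open(2) never completes.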
Re: Plugging in ADB-enabled Android device makes kernel panic with xhci
On Thu, Aug 23, 2018 at 08:45:54PM +0900, Bryan Linton wrote:
> So I found some time to try to bisect this, but was hampered by my
> phone being somewhat temperamental.
>
> Everything up to July 3rd was fine. No crashes occurred.
>
> On a July 15th checkout, my system panicked when trying to run adb
> with my phone connected.
>
> Unfortunately when I tried to bisect this further, my phone began
> refusing to connect to my computer. I get a generic
> "uhub0: device problem, disabling port 2"
> error and cannot get my phone to attach to my computer even if I
> reboot it, plug/unplug it, etc.
>
> I'll see if I can try to bisect this further once I figure out
> what the problem is with my phone, but in the meantime, I wanted
> to at least update the bugs@ list with my findings so far.
>
> I see a few potential commits in that time-frame that could be
> responsible, so I'm going to see if I can manage to narrow this
> down even further.
>
> --
> Bryan

Hi Bryan,

I've narrowed it down.

   Last kernel where adb works: June 24 09:59:46 MDT 2018
   1st Kernel where adb panics: June 25 13:10:32 MDT 2018

I did notice when my phone's booted into LineageOS and I have ADB
turned on, when I connect the phone via USB I get:

   ugen1 at uhub0 port 7 "motorola XT1039" rev 2.00/2.28 addr 8

However, I'm not able to actually connect to it with adb shell or
anything else. It says: Error device offline or something.

When I boot the phone into recovery mode, the phone shows up like
this when I plug it in:

   ugen1 at uhub0 port 7 "Motorola Moto G LTE" rev 2.00/2.28 addr 4

(different name!) and I am able to use adb shell, adb push/pull, etc.

I think there's some issue with LineageOS' ADB mode, but that's not
really relevant here (it's a separate issue and outside of OpenBSD
perhaps though I'll have to test with Linux or some other OS.)

I'm going to look at the commits next.

-Tom