amdgpu: Atom BIOS from ACPI
Dear current-users,

In the patch below, I have implemented the few missing pieces for loading
the Atom BIOS from ACPI which is needed when booting via UEFI. Now the BIOS
loads for me (but amdgpu does not work yet).

I am trying to get amdgpu working on a Ryzen 7 5700G with (from lspci)

Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] (rev c8)

I guess this will be supported if/when drm is next updated, but I have
patched amdgpu_drv.c below to start looking at it. The next issue is
"PSP load tmr failed" ...

--
Kind regards,

Yorick Hardy

Index: sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_drv.c
===================================================================
RCS file: /cvsroot/src/sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_drv.c,v
retrieving revision 1.8
diff -u -r1.8 amdgpu_drv.c
--- sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_drv.c	19 Dec 2021 12:23:42 - 1.8
+++ sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_drv.c	5 Jan 2024 07:32:12 -
@@ -1013,6 +1013,7 @@

 	/* Renoir */
 	{0x1002, 0x1636, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU},
+	{0x1002, 0x1638, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU},

 	/* Navi12 */
 	{0x1002, 0x7360, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI12|AMD_EXP_HW_SUPPORT},
Index: sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_bios.c
===================================================================
RCS file: /cvsroot/src/sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_bios.c,v
retrieving revision 1.6
diff -u -r1.6 amdgpu_bios.c
--- sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_bios.c	27 Feb 2022 14:23:24 - 1.6
+++ sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_bios.c	5 Jan 2024 07:32:09 -
@@ -402,7 +402,7 @@
 	return amdgpu_asic_read_disabled_bios(adev);
 }

-#ifdef CONFIG_ACPI
+#if defined(CONFIG_ACPI) || (NACPICA > 0)
 static bool amdgpu_acpi_vfct_bios(struct amdgpu_device *adev)
 {
 	struct acpi_table_header *hdr;
@@ -412,7 +412,11 @@

 	if (!ACPI_SUCCESS(acpi_get_table("VFCT", 1, &hdr)))
 		return false;
+#ifdef __NetBSD__
+	tbl_size = hdr->Length;
+#else
 	tbl_size = hdr->length;
+#endif
 	if (tbl_size < sizeof(UEFI_ACPI_VFCT)) {
 		DRM_ERROR("ACPI VFCT table present but broken (too short #1)\n");
 		return false;
Index: sys/external/bsd/drm2/include/linux/acpi.h
===================================================================
RCS file: /cvsroot/src/sys/external/bsd/drm2/include/linux/acpi.h,v
retrieving revision 1.10
diff -u -r1.10 acpi.h
--- sys/external/bsd/drm2/include/linux/acpi.h	28 May 2022 01:07:47 - 1.10
+++ sys/external/bsd/drm2/include/linux/acpi.h	5 Jan 2024 07:50:25 -
@@ -56,6 +56,7 @@
 union acpi_object *acpi_evaluate_dsm_typed(acpi_handle, const guid_t *,
     uint64_t, uint64_t, union acpi_object *, acpi_object_type);
 bool acpi_check_dsm(acpi_handle, const guid_t *, uint64_t, uint64_t);
+u32 acpi_get_table(const char*, u32, struct acpi_table_header **);

 #endif	/* NACPICA > 0 */
 #endif	/* _LINUX_ACPI_H_ */
Index: sys/external/bsd/drm2/linux/linux_acpi.c
===================================================================
RCS file: /cvsroot/src/sys/external/bsd/drm2/linux/linux_acpi.c,v
retrieving revision 1.2
diff -u -r1.2 linux_acpi.c
--- sys/external/bsd/drm2/linux/linux_acpi.c	28 Feb 2022 17:15:30 - 1.2
+++ sys/external/bsd/drm2/linux/linux_acpi.c	5 Jan 2024 07:50:29 -
@@ -120,3 +120,9 @@
 		return true;
 	return false;
 }
+
+u32
+acpi_get_table(const char* signature, u32 instance, struct acpi_table_header **hdr)
+{
+	return AcpiGetTable(signature, instance, hdr);
+}
Re: Serious crashes on 9.99.93
Dear current-users,

On 2021-12-28, pin wrote:
> ‐‐‐ Original Message ‐‐‐
>
> Unfortunately, I had to wipe my disk due to a corrupted file system.
> I did a fresh install using the 9.99.93 image from today 28-Dec-2021 08:57
>
> The same crash happened the first time I launched the web browser.
> This time I got a stack backtrace, as follows:
>
>
> pin@mybox # pwd
> /var/crash
> pin@mybox # ls -l
> total 797032
> -rw------- 1 root wheel         2 Dec 28 20:41 bounds
> -rw------- 1 root wheel         5 Dec 28 00:19 minfree
> -rw------- 1 root wheel   2332254 Dec 28 20:41 netbsd.0
> -rw------- 1 root wheel 405485080 Dec 28 20:41 netbsd.0.core
> pin@mybox # gdb --eval-command="file /netbsd.gdb"
> GNU gdb (GDB) 11.0.50.20200914-git
> Copyright (C) 2020 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64--netbsd".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <https://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
>
> For help, type "help".
> Type "apropos word" to search for commands related to "word".
> Reading symbols from /netbsd.gdb...
> (gdb) target kvm netbsd.0.core
> 0x802261f5 in cpu_reboot (howto=howto@entry=260,
> bootstr=bootstr@entry=0x0) at /usr/src/sys/arch/amd64/amd64/machdep.c:720
> 720 /usr/src/sys/arch/amd64/amd64/machdep.c: No such file or directory.
> (gdb) bt
> #0 0x802261f5 in cpu_reboot (howto=howto@entry=260,
> bootstr=bootstr@entry=0x0) at /usr/src/sys/arch/amd64/amd64/machdep.c:720
> #1 0x80dbcd54 in kern_reboot (howto=howto@entry=260,
> bootstr=bootstr@entry=0x0) at /usr/src/sys/kern/kern_reboot.c:73
> #2 0x80dffcf2 in vpanic (fmt=fmt@entry=0x81390370 "trap",
> ap=ap@entry=0x8580b77ebab8) at /usr/src/sys/kern/subr_prf.c:290
> #3 0x80dffdb7 in panic (fmt=fmt@entry=0x81390370 "trap") at
> /usr/src/sys/kern/subr_prf.c:209
> #4 0x80229017 in trap (frame=0x8580b77ebc00) at
> /usr/src/sys/arch/amd64/amd64/trap.c:326
> #5 0x802210e3 in alltraps ()
> #6 0xff404040ff404040 in ?? ()
> #7 0x0019 in ?? ()
> #8 0x80f752dea000 in ?? ()
> #9 0x in ?? ()
> (gdb)
>
>
>
> This did not happen before on 9.99.92 up to the 15th of December.
> Hope that this is useful.
>
> Best Regards

I tried to reproduce the problems that pin is seeing. I seldom got as far
as the web browser, but did get the following backtraces:

Crash version 9.99.42, image version 9.99.93.
WARNING: versions differ, you may not be able to examine this image.
System panicked: pr_phinpage_check: [i915_request] item 0x939b82a4f840 not part of pool
Backtrace from time of crash is available.
_KERNEL_OPT_GENFB_GLYPHCACHE() at 0
_KERNEL_OPT_GENFB_GLYPHCACHE() at 0
sys_reboot() at sys_reboot
vpanic() at vpanic+0x160
device_printf() at device_printf
pool_cache_put_paddr() at pool_cache_put_paddr+0x14f
linux_dma_resv_fini() at linux_dma_resv_fini+0x55
__i915_gem_free_object_rcu() at __i915_gem_free_object_rcu+0x45
gc_thread() at gc_thread+0x92

Crash version 9.99.42, image version 9.99.93.
WARNING: versions differ, you may not be able to examine this image.
System panicked: trap
Backtrace from time of crash is available.
_KERNEL_OPT_GENFB_GLYPHCACHE() at 0
?() at bf014f2a
sys_reboot() at sys_reboot
vpanic() at vpanic+0x160
device_printf() at device_printf
startlwp() at startlwp
calltrap() at calltrap+0x19
ffs_sync() at ffs_sync+0x75
VFS_SYNC() at VFS_SYNC+0x22
sched_sync() at sched_sync+0x90

Could the first backtrace be a clue? Changes were made in this area after
the 15th, which is when pin still had a stable kernel.

--
Kind regards,

Yorick Hardy
Re: Serious crashes on 9.99.93
Dear pin,

On 2021-12-28, pin wrote:
>
> Sent with ProtonMail Secure Email.
>
> ‐‐‐ Original Message ‐‐‐
>
> On Tuesday, December 28th, 2021 at 16:29, Yorick Hardy
> wrote:
>
> > Dear pin,
> >
> > On 2021-12-27, pin wrote:
> >
> > > Hi all,
> > >
> > > I've upgraded my amd64 machine to NetBSD-9.99.93 yesterday and I'm
> > > experiencing serious crashes which were not happening on 9.99.92.
> > >
> > > dmesg, https://pastebin.com/8WJeUJDj
> > >
> > > Xorg-log, https://pastebin.com/xTAmUZPU
> > >
> > > The backtraces from the coredumps aren't really useful,
> > > https://pastebin.com/eaXYEC0Z
> > >
> > > I've managed to reproduce the crashes by launching lariza or badwolf web
> > > browsers.
> > >
> > > The system runs without issues if I don't use a web browser.
> > >
> > > Also, I've noticed the following while booting after a crash
> > >
> > > panic: kernel diagnostic assertion "solocked2(so, so2)" failed: file
> > > "/usr/src/sys/kern/uipc_usrreq.c", line 525
> > >
> > > Finally, unsure if related, console resolution doesn't scale after
> > > loading i915drmkms0, it used to in 9.99.92.
> > >
> > > Although, resolution after startx is correct.
> > >
> > > I've hosted the core-dumps in case,
> > >
> > > netbsd.2.core.gz, https://ufile.io/d00lfx4f
> > >
> > > netbsd.2.gz, https://ufile.io/4yvklq5w
> > >
> > > Thank you for any hints.
> > >
> > > Best,
> > >
> > > pin
> > >
> > > Sent with ProtonMail Secure Email.
> >
> > Somehow I managed to get a backtrace (it seems to be correct):
> >
> > Crash version 9.99.82, image version 9.99.93.
> > WARNING: versions differ, you may not be able to examine this image.
> > crash: _kvm_kvatop(0)
> > Kernel compiled without options LOCKDEBUG.
> > System panicked: kernel diagnostic assertion "solocked2(so, so2)" failed:
> > file "/usr/src/sys/kern/uipc_usrreq.c", line 525
> > Backtrace from time of crash is available.
> > crash> bt
> > _KERNEL_OPT_GENFB_GLYPHCACHE() at 0
> > _KERNEL_OPT_GENFB_GLYPHCACHE() at 0
> > sys_reboot() at sys_reboot
> > vpanic() at vpanic+0x160
> > __x86_indirect_thunk_rax() at __x86_indirect_thunk_rax
> > unp_send() at unp_send+0xa15
> > sosend() at sosend+0x845
> > soo_write() at soo_write+0x2f
> > do_filewritev.part.0() at do_filewritev.part.0+0x25d
> > syscall() at syscall+0x196
> > --- syscall (number 121) ---
> > syscall+0x196:
> > crash>
> >
> > the last change I could find which might be relevant was in
> > sys/kern/sys_generic.c 1.133, but that
> > was before 9.99.93, so I am not sure where to look.
> >
> > --
> > Kind regards,
> >
> > Yorick Hardy
>
> Thanks!
>
> Was this change made in 9.99.92 after the 15th of December?
>
> Until the above date this did not happen.
>
> Regards

11 September, so: long before!

But thanks, I will continue to look for any changes after 15 December
(I am sure someone more knowledgeable will find it, but I will have a go
in the meantime).

--
Kind regards,

Yorick Hardy
Re: Serious crashes on 9.99.93
Dear pin,

On 2021-12-27, pin wrote:
> Hi all,
> I've upgraded my amd64 machine to NetBSD-9.99.93 yesterday and I'm experiencing
> serious crashes which were not happening on 9.99.92.
> dmesg, https://pastebin.com/8WJeUJDj
> Xorg-log, https://pastebin.com/xTAmUZPU
>
> The backtraces from the coredumps aren't really useful,
> https://pastebin.com/eaXYEC0Z
>
> I've managed to reproduce the crashes by launching lariza or badwolf web
> browsers.
> The system runs without issues if I don't use a web browser.
> Also, I've noticed the following while booting after a crash
>
> panic: kernel diagnostic assertion "solocked2(so, so2)" failed: file
> "/usr/src/sys/kern/uipc_usrreq.c", line 525
>
> Finally, unsure if related, console resolution doesn't scale after loading
> i915drmkms0, it used to in 9.99.92.
> Although, resolution after startx is correct.
>
> I've hosted the core-dumps in case,
> netbsd.2.core.gz, https://ufile.io/d00lfx4f
> netbsd.2.gz, https://ufile.io/4yvklq5w
>
> Thank you for any hints.
> Best,
> pin
>
> Sent with ProtonMail Secure Email.

Somehow I managed to get a backtrace (it seems to be correct):

Crash version 9.99.82, image version 9.99.93.
WARNING: versions differ, you may not be able to examine this image.
crash: _kvm_kvatop(0)
Kernel compiled without options LOCKDEBUG.
System panicked: kernel diagnostic assertion "solocked2(so, so2)" failed: file "/usr/src/sys/kern/uipc_usrreq.c", line 525
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_GENFB_GLYPHCACHE() at 0
_KERNEL_OPT_GENFB_GLYPHCACHE() at 0
sys_reboot() at sys_reboot
vpanic() at vpanic+0x160
__x86_indirect_thunk_rax() at __x86_indirect_thunk_rax
unp_send() at unp_send+0xa15
sosend() at sosend+0x845
soo_write() at soo_write+0x2f
do_filewritev.part.0() at do_filewritev.part.0+0x25d
syscall() at syscall+0x196
--- syscall (number 121) ---
syscall+0x196:
crash>

The last change I could find which might be relevant was in
sys/kern/sys_generic.c 1.133, but that was before 9.99.93, so I am not
sure where to look.

--
Kind regards,

Yorick Hardy
Re: Panic in usbd_create_xfer
On 2021-01-03, Yorick Hardy wrote:
> Dear matthew,
>
> On 2021-01-03, matthew green wrote:
> > Yorick Hardy writes:
> > > Dear current-users,
> > >
> > > Happy new year!
> >
> > happy new year yorick! and everyone.
> >
> > > [ 659.839003] usbd_create_xfer() at netbsd:usbd_create_xfer+0x186
> > > [ 659.849001] usbd_open_pipe_intr() at netbsd:usbd_open_pipe_intr+0x74
> > > [ 659.849001] uhidev_open() at netbsd:uhidev_open+0x21c
> >
> > can you find out what lines in the source these are?
> > especially usbd_create_xfer+0x186, the other ones are
> > most likely obvious only the single callers - eg,
> > usbd_open_pipe_intr() calls usbd_create_xfer() once.
> >
> > thanks.
> >
> >
> > .mrg.
>
> In the disassembly (I guess due to inlining) it happens at a call to
> usb_allocmem, which I think is line sys/dev/usb/usbdi.c:606, i.e. the
> call to usbd_alloc_buffer.
>
> I am currently trying to trigger the panic with all of the USB_DEBUG and
> {U,O,E}HCI_DEBUG options enabled but it has not happened yet (I was sure
> I would be able to panic the kernel by now, maybe the _DEBUG options
> work around the panic somehow?).

Reverting to src/sys/dev/usb/ohci.c revision 1.310 seems to solve the
panic; I have not yet determined how this change could lead to a panic.

--
Kind regards,

Yorick Hardy
Re: Panic in usbd_create_xfer
Dear matthew,

On 2021-01-03, matthew green wrote:
> Yorick Hardy writes:
> > Dear current-users,
> >
> > Happy new year!
>
> happy new year yorick! and everyone.
>
> > [ 659.839003] usbd_create_xfer() at netbsd:usbd_create_xfer+0x186
> > [ 659.849001] usbd_open_pipe_intr() at netbsd:usbd_open_pipe_intr+0x74
> > [ 659.849001] uhidev_open() at netbsd:uhidev_open+0x21c
>
> can you find out what lines in the source these are?
> especially usbd_create_xfer+0x186, the other ones are
> most likely obvious only the single callers - eg,
> usbd_open_pipe_intr() calls usbd_create_xfer() once.
>
> thanks.
>
>
> .mrg.

In the disassembly (I guess due to inlining) it happens at a call to
usb_allocmem, which I think is line sys/dev/usb/usbdi.c:606, i.e. the
call to usbd_alloc_buffer.

I am currently trying to trigger the panic with all of the USB_DEBUG and
{U,O,E}HCI_DEBUG options enabled, but it has not happened yet (I was sure
I would be able to panic the kernel by now; maybe the _DEBUG options
work around the panic somehow?).

--
Kind regards,

Yorick Hardy
Re: Panic in usbd_create_xfer
Dear current-users,

Happy new year!

On 2020-12-30, Yorick Hardy wrote:
> Dear current-users,
>
> Is anyone else seeing panics when opening a uhidev? (Generally from SDL.)
> I am using a custom kernel with a uintuos (not committed) Wacom, ukbd
> and ums hid devices. The panic only happens after a few days, the host
> controller is ehci (ATI SB700).
>
> Some example panics:

I had forgotten to include DIAGNOSTIC; now the panic shows a warning from
ohci:

[ 659.829010] uvm_fault(0xb2272ae9f480, 0x0, 1) -> e
[ 659.829010] fatal page fault in supervisor mode
[ 659.829010] trap type 6 code 0 rip 0x80357ce6 cs 0x8 rflags 0x210206 cr2 0 ilevel 0 rsp 0xce014cdbea30
[ 659.829010] curlwp 0xb2272638b9c0 pid 1880.1880 lowest kstack 0xce014cdba2c0
[ 659.829010] panic: trap
[ 659.829010] cpu0: Begin traceback...
[ 659.829010] ohci1: WARNING: addr 0x40054bc0 not found
[ 659.829010] vpanic() at netbsd:vpanic+0x156
[ 659.839003] snprintf() at netbsd:snprintf
[ 659.839003] startlwp() at netbsd:startlwp
[ 659.839003] alltraps() at netbsd:alltraps+0xbb
[ 659.839003] usbd_create_xfer() at netbsd:usbd_create_xfer+0x186
[ 659.849001] usbd_open_pipe_intr() at netbsd:usbd_open_pipe_intr+0x74
[ 659.849001] uhidev_open() at netbsd:uhidev_open+0x21c
[ 659.849001] uhidopen() at netbsd:uhidopen+0xf9
[ 659.859002] cdev_open() at netbsd:cdev_open+0xae
[ 659.859002] spec_open() at netbsd:spec_open+0x176
[ 659.869002] VOP_OPEN() at netbsd:VOP_OPEN+0x3c
[ 659.869002] vn_open() at netbsd:vn_open+0x130
[ 659.869002] do_open() at netbsd:do_open+0x119
[ 659.879002] do_sys_openat() at netbsd:do_sys_openat+0x74
[ 659.879002] sys_open() at netbsd:sys_open+0x24
[ 659.889001] syscall() at netbsd:syscall+0x1cc
[ 659.889001] --- syscall (number 5) ---
[ 659.889001] netbsd:syscall+0x1cc:
[ 659.889001] cpu0: End traceback...

--
Kind regards,

Yorick Hardy
Panic in usbd_create_xfer
Dear current-users,

Is anyone else seeing panics when opening a uhidev? (Generally from SDL.)
I am using a custom kernel with a uintuos (not committed) Wacom, ukbd
and ums hid devices. The panic only happens after a few days; the host
controller is ehci (ATI SB700).

Some example panics:

[ 86314.177679] uvm_fault(0x9eda7c67d6f8, 0x81, 1) -> e
[ 86314.177679] fatal page fault in supervisor mode
[ 86314.177679] trap type 6 code 0 rip 0x8035784f cs 0x8 rflags 0x10286 cr2 0x810702 ilevel 0 rsp 0x9f8156ab2a40
[ 86314.177679] curlwp 0x9edaa1c52900 pid 786.786 lowest kstack 0x9f8156aae2c0
[ 86314.177679] panic: trap
[ 86314.177679] cpu1: Begin traceback...
[ 86314.177679] vpanic() at netbsd:vpanic+0x156
[ 86314.187672] snprintf() at netbsd:snprintf
[ 86314.187672] startlwp() at netbsd:startlwp
[ 86314.187672] alltraps() at netbsd:alltraps+0xbb
[ 86314.187672] usbd_create_xfer() at netbsd:usbd_create_xfer+0x186
[ 86314.197668] usbd_open_pipe_intr() at netbsd:usbd_open_pipe_intr+0x74
[ 86314.197668] uhidev_open() at netbsd:uhidev_open+0x18c
[ 86314.207668] uhidopen() at netbsd:uhidopen+0xe1
[ 86314.207668] cdev_open() at netbsd:cdev_open+0xae
[ 86314.207668] spec_open() at netbsd:spec_open+0x176
[ 86314.217669] VOP_OPEN() at netbsd:VOP_OPEN+0x3c
[ 86314.217669] vn_open() at netbsd:vn_open+0x130
[ 86314.217669] do_open() at netbsd:do_open+0x119
[ 86314.227669] do_sys_openat() at netbsd:do_sys_openat+0x74
[ 86314.227669] sys_open() at netbsd:sys_open+0x24
[ 86314.237670] syscall() at netbsd:syscall+0x1cc
[ 86314.237670] --- syscall (number 5) ---
[ 86314.237670] netbsd:syscall+0x1cc:
[ 86314.237670] cpu1: End traceback...

Crash version 9.99.77, image version 9.99.77.
crash: _kvm_kvatop(0)
Kernel compiled without options LOCKDEBUG.
System panicked: trap
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NAGR() at 0
?() at bb0150547000
sys_reboot() at sys_reboot
vpanic() at vpanic+0x160
snprintf() at snprintf
startlwp() at startlwp
calltrap() at calltrap+0x11
usbd_create_xfer() at usbd_create_xfer+0x186
usbd_open_pipe_intr() at usbd_open_pipe_intr+0x74
uhidev_open() at uhidev_open+0x18c
uhidopen() at uhidopen+0xe1
cdev_open() at cdev_open+0xae
spec_open() at spec_open+0x176
VOP_OPEN() at VOP_OPEN+0x3c
vn_open() at vn_open+0x130
do_open() at do_open+0x119
do_sys_openat() at do_sys_openat+0x74
sys_open() at sys_open+0x24
syscall() at syscall+0x1cc
--- syscall (number 5) ---
syscall+0x1cc:
crash>

Crash version 9.99.77, image version 9.99.77.
crash: _kvm_kvatop(0)
Kernel compiled without options LOCKDEBUG.
System panicked: trap
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NAGR() at 0
?() at b7814c8e
sys_reboot() at sys_reboot
vpanic() at vpanic+0x160
snprintf() at snprintf
startlwp() at startlwp
calltrap() at calltrap+0x11
usbd_create_xfer() at usbd_create_xfer+0x186
usbd_open_pipe_intr() at usbd_open_pipe_intr+0x74
uhidev_open() at uhidev_open+0x18c
uhidopen() at uhidopen+0xe1
cdev_open() at cdev_open+0xae
spec_open() at spec_open+0x176
VOP_OPEN() at VOP_OPEN+0x3c
vn_open() at vn_open+0x130
do_open() at do_open+0x119
do_sys_openat() at do_sys_openat+0x74
sys_open() at sys_open+0x24
syscall() at syscall+0x1cc
--- syscall (number 5) ---
syscall+0x1cc:

--
Kind regards,

Yorick Hardy
Re: build-success/install-fault on i486 with xsrc
Dear Maya,

On 2020-11-01, m...@netbsd.org wrote:
> On Sun, Nov 01, 2020 at 04:44:41PM +0100, Lizbeth Mutterhunt, Ph.D wrote:
> > so!
> >
> > sucess in the built! had a hung-up at the latest test:
> >
> > _nv_cas.d.tmp atomic_or_64_nv_cas.d
> > : fatal error: when writing output to : No space left on device
> >
> > but /tmp was empty and there were 4,5GB free. Damn old drag, did this
> > business! now gonna to do some emulation for rooting mobile.
> >
> > thx and good luck in BSDing,
>
> if you update src, it will re-enable GLX_USE_TLS and should just work on
> i386, no changes necessary.

Apologies for resurrecting an old thread; I am not sure it is entirely
related.

I have not been able to use libGL on i386 for quite some time. The
following core dump might provide some information (the back trace is
unusable/empty). This is on current updated on 15 December.

Unfortunately, I still do not know the cause of the SIGSEGV. Does glxgears
work for anyone else (on i386+i915)?

--
Kind regards,

Yorick Hardy

$ gdb /usr/X11R7/bin/glxgears /tmp/yorick.glxgears.core
GNU gdb (GDB) 11.0.50.20200914-git
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "i486--netbsdelf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/X11R7/bin/glxgears...
(No debugging symbols found in /usr/X11R7/bin/glxgears)
[New process 20022]
Core was generated by `glxgears'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0xba370252 in _mesa_x86_cliptest_points4 () from /usr/X11R7/lib/modules/dri/i915_dri.so
(gdb) disassemble
Dump of assembler code for function _mesa_x86_cliptest_points4:
   0xba3701e0 <+0>:     push   %esi
   0xba3701e1 <+1>:     push   %edi
   0xba3701e2 <+2>:     push   %ebp
   0xba3701e3 <+3>:     push   %ebx
   0xba3701e4 <+4>:     call   0xba3701f8 <_mesa_x86_cliptest_points4+24>
   0xba3701e9 <+9>:     add    $0x348e17,%ebx
   0xba3701ef <+15>:    lea    0x30fe0(%ebx),%ebx
   0xba3701f5 <+21>:    push   %ebx
   0xba3701f6 <+22>:    jmp    0xba3701fc <_mesa_x86_cliptest_points4+28>
   0xba3701f8 <+24>:    mov    (%esp),%ebx
   0xba3701fb <+27>:    ret
   0xba3701fc <+28>:    mov    0x18(%esp),%esi
   0xba370200 <+32>:    mov    0x1c(%esp),%edi
   0xba370204 <+36>:    mov    0x20(%esp),%edx
   0xba370208 <+40>:    mov    0x24(%esp),%ebx
   0xba37020c <+44>:    mov    0x28(%esp),%ebp
   0xba370210 <+48>:    mov    0x14(%esi),%eax
   0xba370213 <+51>:    mov    0x10(%esi),%ecx
   0xba370216 <+54>:    mov    0x8(%esi),%esi
   0xba370219 <+57>:    orl    $0xf,0x1c(%edi)
   0xba37021d <+61>:    mov    %eax,0x18(%esp)
   0xba370221 <+65>:    movl   $0x4,0x18(%edi)
   0xba370228 <+72>:    mov    %ecx,0x10(%edi)
   0xba37022b <+75>:    mov    0x8(%edi),%edi
   0xba37022e <+78>:    add    %edx,%ecx
   0xba370230 <+80>:    mov    %ecx,0x20(%esp)
   0xba370234 <+84>:    cmp    %ecx,%edx
   0xba370236 <+86>:    mov    (%ebx),%al
   0xba370238 <+88>:    mov    0x0(%ebp),%ah
   0xba37023b <+91>:    je     0xba3702e7 <_mesa_x86_cliptest_points4+263>
   0xba370241 <+97>:    lea    0x0(%esi,%eiz,1),%esi
   0xba370248 <+104>:   lea    0x0(%esi,%eiz,1),%esi
   0xba37024f <+111>:   nop
   0xba370250 <+112>:   fld1
=> 0xba370252 <+114>:   fdivs  0xc(%esi)
   0xba370255 <+117>:   mov    0xc(%esi),%ebp
   0xba370258 <+120>:   mov    0x8(%esi),%ebx
   0xba37025b <+123>:   xor    %ecx,%ecx
--Type <RET> for more, q to quit, c to continue without paging--
Re: uvm_map_enter entry merging (was Re: vrelel...)
Dear Chuck,

On 2020-11-29, Chuck Silvers wrote:
> hi Yorick,
>
> On Sat, Nov 28, 2020 at 12:39:56AM +0200, Yorick Hardy wrote:
> > May I ask if you have an opinion on this patch? I have
> > not noticed any bad behaviour if it is omitted but, if I read
> > the code correctly, I don't think it is correct to fall through
> > for this case.
>
> this function is very hard to follow, it's very tangled.
> I stared at it for a while and I didn't see anything wrong,
> but it's hard to be sure just from reading the code.

I am sorry to have made more work for you, and thanks for looking at it!
I agree, and since I have only a superficial understanding of how
everything fits together I am not 100% sure that my reading is correct.

> could you explain the specific case that you think is wrong now and
> that your patch fixes?

I will look at the tests as suggested below. Please read/ignore this part
as you see fit.

Superficially (and perhaps that is part of the "not reading correctly"!)

1) if we reach

   http://nxr.netbsd.org/xref/src/sys/uvm/uvm_map.c#1479

   then one of two things happened (with merged == 1):

   a) an amap was extended
   b) neither prev_entry nor prev_entry->next have an amap

2) now we reach

   https://nxr.netbsd.org/xref/src/sys/uvm/uvm_map.c#1493

   and since merged == 1 in (1)

   https://nxr.netbsd.org/xref/src/sys/uvm/uvm_map.c#1497

   UVMMAP_EVCNT_DECR(ubackmerge);
   UVMMAP_EVCNT_INCR(ubimerge);

   but in case 1(b) the forward merge never happened?

That was the starting point of my query. I thought perhaps that 1(b) never
happens, but a printf shows that it happens often.

I am "living dangerously" and running with the patch for now, although I
am not sure yet where to look for any adverse effects.

> even better would be if you could write a set of atf tests to exercise
> all of the possible merge cases and verify that the contents of memory
> after the new mapping is created is what it should be.
> any previous and next mapping should have the same contents as before,
> and the new mapping should have either zeroes (for a new amap mapping)
> or the uobj contents at that offset (for a new uobj mapping).

I did not think I could do that, I will look into it!
(Energy is low again so, when I get to it ...)

> note that a vm_map_entry can reference both a uobj and an amap at the
> same time, so there are 4 possible cases for the each of previous and next
> entries (none, uobj, amap, uobj+amap), and two possible cases for the
> new entry (uobj, amap). then I guess there are two more factors of 2
> for whether the forward and/or backward merges succeed, so that gives
> at least 128 cases to test. I think there are some more cases hidden
> in there because there are multiple reasons why the merges might fail
> and those checks are in different places, so it would really be best
> to test all of the different possible paths through this function.
>
> I would be reluctant to change anything here without such a set of
> comprehensive tests, because even if we are sure that a change fixes
> one case, it would be very hard to be sure that it doesn't break
> some other case.
>
> -Chuck

Understood, it is quite possible that I am way out of my depth!

--
Kind regards,

Yorick Hardy
Re: Panic: vrelel: bad ref count (9.99.54)
Dear Chuck,

On 2020-11-27, Chuck Silvers wrote:
> Hi Yorick,
>
> On Fri, Nov 27, 2020 at 06:29:07PM +0200, Yorick Hardy wrote:
> >
> > I think that uvm_mremap did not keep pace with changes in uvm.
> > This patch seems to fix it for me, although I have only tested
> > for two days so far (I am usually able to trigger the panic by
> > now ... but let's see).
>
> Your patch looks good, please go ahead and commit it.
>
> -Chuck

Thanks!

May I ask if you have an opinion on this patch? I have
not noticed any bad behaviour if it is omitted but, if I read
the code correctly, I don't think it is correct to fall through
for this case.

--
Kind regards,

Yorick Hardy

Index: uvm_map.c
===================================================================
RCS file: /cvsroot/src/sys/uvm/uvm_map.c,v
retrieving revision 1.385
diff -u -r1.385 uvm_map.c
--- uvm_map.c	9 Jul 2020 05:57:15 - 1.385
+++ uvm_map.c	19 Nov 2020 16:04:07 -
@@ -1477,6 +1477,13 @@
 				    amapwaitflag | AMAP_EXTEND_BACKWARDS))
 					goto nomerge;
 			}
+
+			/*
+			 * We could not extend either amap, just skip on.
+			 */
+			else {
+				goto nomerge;
+			}
 		} else {
 			/*
 			 * Pull the next entry's amap backwards to cover this
Re: Panic: vrelel: bad ref count (9.99.54)
Dear Andrew and Leonardo,

On 2020-11-19, Yorick Hardy wrote:
> Dear Andrew,
>
> On 2020-05-05, Andrew Doran wrote:
> > On Mon, May 04, 2020 at 03:54:57PM +0200, Leonardo Taccari wrote:
> > > Hello Yorick and Andrew,
> > >
> > > Yorick Hardy writes:
> > > > > > > [...]
> > > > > > >
> > > > > > > Crash version 9.99.55, image version 9.99.55.
> > > > > > > crash: _kvm_kvatop(0)
> > > > > > > Kernel compiled without options LOCKDEBUG.
> > > > > > > System panicked: vrelel: bad ref count
> > > > > > > Backtrace from time of crash is available.
> > > > > > > crash> bt
> > > > > > > _KERNEL_OPT_NAGR() at 0
> > > > > > > ?() at 7f7ff7ecf000
> > > > > > > sys_reboot() at sys_reboot
> > > > > > > vpanic() at vpanic+0x181
> > > > > > > vtryrele() at vtryrele
> > > > > > > vcache_dealloc() at vcache_dealloc
> > > > > > > uvm_unmap_detach() at uvm_unmap_detach+0x76
> > > > > > > uvm_unmap1() at uvm_unmap1+0x4e
> > > > > > > uvm_mremap() at uvm_mremap+0x36b
> > > > > > > sys_mremap() at sys_mremap+0x68
> > > > > > > syscall() at syscall+0x227
> > > > > > > --- syscall (number 411) ---
> > > > > > > 797459842e9a:
> > > > > > > crash>

[ rest of thread omitted ]

I think that uvm_mremap did not keep pace with changes in uvm.
This patch seems to fix it for me, although I have only tested
for two days so far (I am usually able to trigger the panic by
now ... but let's see).

Leonardo, would you be willing to try the patch?

--
Kind regards,

Yorick Hardy

Index: sys/uvm/uvm_mremap.c
===================================================================
RCS file: /cvsroot/src/sys/uvm/uvm_mremap.c,v
retrieving revision 1.20
diff -u -r1.20 uvm_mremap.c
--- sys/uvm/uvm_mremap.c	23 Feb 2020 15:46:43 - 1.20
+++ sys/uvm/uvm_mremap.c	26 Nov 2020 19:14:06 -
@@ -80,10 +80,8 @@
 			error = E2BIG; /* XXX */
 			goto done;
 		}
-		rw_enter(uobj->vmobjlock, RW_WRITER);
-		KASSERT(uobj->uo_refs > 0);
-		atomic_inc_uint(&uobj->uo_refs);
-		rw_exit(uobj->vmobjlock);
+		if (uobj->pgops->pgo_reference)
+			uobj->pgops->pgo_reference(uobj);
 		reserved_entry->object.uvm_obj = uobj;
 		reserved_entry->offset = newoffset;
 	}
Re: Panic: vrelel: bad ref count (9.99.54)
Dear Andrew, On 2020-05-05, Andrew Doran wrote: > On Mon, May 04, 2020 at 03:54:57PM +0200, Leonardo Taccari wrote: > > Hello Yorick and Andrew, > > > > Yorick Hardy writes: > > > > > > [...] > > > > > > > > > > > > Crash version 9.99.55, image version 9.99.55. > > > > > > crash: _kvm_kvatop(0) > > > > > > Kernel compiled without options LOCKDEBUG. > > > > > > System panicked: vrelel: bad ref count > > > > > > Backtrace from time of crash is available. > > > > > > crash> bt > > > > > > _KERNEL_OPT_NAGR() at 0 > > > > > > ?() at 7f7ff7ecf000 > > > > > > sys_reboot() at sys_reboot > > > > > > vpanic() at vpanic+0x181 > > > > > > vtryrele() at vtryrele > > > > > > vcache_dealloc() at vcache_dealloc > > > > > > uvm_unmap_detach() at uvm_unmap_detach+0x76 > > > > > > uvm_unmap1() at uvm_unmap1+0x4e > > > > > > uvm_mremap() at uvm_mremap+0x36b > > > > > > sys_mremap() at sys_mremap+0x68 > > > > > > syscall() at syscall+0x227 > > > > > > --- syscall (number 411) --- > > > > > > 797459842e9a: > > > > > > crash> > > > > > > > > > > The same just happened on 9.99.56 while fetching (POP) mail using > > > > > mail/fdm. > > > > > > > > Could you file a PR please? If this panics again could you please run > > > > the > > > > "dmesg" command in crash and find out what it printed about the vnode? > > > > That > > > > would be very useful. > > > > > > > > Thanks, > > > > Andrew > > > > > > I will do so (... perhaps only this weekend). > > > [...] > > > > I was able to reproduce it too with a yesterday evening NetBSD/amd64 > > -current when using mail/fdm and I will try to prepare a minimal > > reproducer using mail/fdm and file a PR if noone beat me. 
> > > > In the meantime here the information from dmesg: > > > > [ 6107.6380323] vnode 0xa95219747d40 flags 0x418 > > [ 6107.6380323]tag VT_TMPFS(25) type VREG(1) mount 0xa951f6d89000 > > typedata 0xa95255e32c90 > > [ 6107.6380323]usecount 1 writecount 1 holdcount 0 > > [ 6107.6380323]size 18000 writesize 18000 numoutput 0 > > [ 6107.6380323]data 0xa952583304a0 lock 0xa95219747f00 > > [ 6107.6380323]state LOADED key(0xa951f6d89000 8) a0 04 33 58 52 a9 > > ff ff > > [ 6107.6380323]lrulisthd 0x816b5ed0 > > [ 6107.6380323]tag VT_TMPFS, tmpfs_node 0xa952583304a0, flags 0x0, > > links 1 > > [ 6107.6380323]mode 0600, owner 1000, group 0, size 98304 > > [ 6107.6380323] panic: vrelel: bad ref count > > [ 6107.6380323] cpu0: Begin traceback... > > [ 6107.6380323] vpanic() at netbsd:vpanic+0x178 > > [ 6107.6480364] vnpanic() at netbsd:vnpanic+0x49 > > [ 6107.6480364] vrelel() at netbsd:vrelel+0x5b6 > > [ 6107.6480364] uvm_unmap_detach() at netbsd:uvm_unmap_detach+0x8e > > [ 6107.6480364] sys_munmap() at netbsd:sys_munmap+0x85 > > [ 6107.6480364] syscall() at netbsd:syscall+0x2a0 > > [ 6107.6480364] --- syscall (number 73) --- > > [ 6107.6480364] 7c1e5d18414a: > > [ 6107.6480364] cpu0: End traceback... > > [ 6107.6480364] fatal breakpoint trap in supervisor mode > > [ 6107.6480364] trap type 1 code 0 rip 0x802219fd cs 0x8 rflags > > 0x202 cr2 0x7f7ff7ee5000 ilevel 0 rsp 0xc100c227ae20 > > [ 6107.6480364] curlwp 0xa9521e1b1600 pid 20756.20756 lowest kstack > > 0xc100c22772c0 > > [ 6107.6480364] dumping to dev 0,1 (offset=276847, size=2062664): > > [ 6107.6480364] dump > > > > If any possible further information is needed do not hesitate to > > contact me! > > > > > > Thanks! > > Thank you. I opened PR 55237 to track so I don't forget. > > Andrew I am still trying to track this down, but I can only understand small pieces of the code at the moment. While going through uvm_map_enter in sys/uvm/uvm_map.c, it looks like there is an unhandled case (patch below). Is this correct? 
It seems to happen quite often, but with or without the patch the system
seems equally (un)stable.

-- 
Kind regards,

Yorick Hardy

Index: uvm_map.c
===
RCS file: /cvsroot/src/sys/uvm/uvm_map.c,v
retrieving revision 1.385
diff -u -r1.385 uvm_map.c
--- uvm_map.c 9 Jul 2020 05:57:15 - 1.385
+++ uvm_map.c 19 Nov 2020 16:04:07 -
@@ -1477,6 +1477,13 @@
             amapwaitflag | AMAP_EXTEND_BACKWARDS))
           goto nomerge;
         }
+
+        /*
+         * We could not extend either amap, just skip on.
+         */
+        else {
+          goto nomerge;
+        }
       } else {
         /*
          * Pull the next entry's amap backwards to cover this
Re: PATCH: Relax fdatasync checks to IEEE Std 1003.1-2008
Thanks! On 2020-05-24, Paul Ripke wrote: > On Sun, May 24, 2020 at 06:56:54AM -0400, Greg Troxel wrote: > > Yorick Hardy writes: > > > > (I realize you later say this isn't it.) > > > > >> @@ -4141,10 +4140,6 @@ sys_fdatasync(struct lwp *l, const struct > > >> sys_fdatasync_args *uap, register_t *r > > >> /* fd_getvnode() will use the descriptor for us */ > > >> if ((error = fd_getvnode(SCARG(uap, fd), &fp)) != 0) > > >> return (error); > > >> -if ((fp->f_flag & FWRITE) == 0) { > > >> -fd_putfile(SCARG(uap, fd)); > > >> -return (EBADF); > > >> -} > > >> vp = fp->f_vnode; > > >> vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); > > >> error = VOP_FSYNC(vp, fp->f_cred, FSYNC_WAIT|FSYNC_DATAONLY, 0, > > >> 0); > > > > If you look at the function beyond what's in the diff, you will see (I > > think, but I really mean I see) that there is always a single > > fd_putfile. This was just doing the put before returning, rather than > > setting error and the usual "goto out" where the end-of-routine cleanup > > happens. See also sys_fsync_range() in the same file. > > > > I could be reading this wrong. > > I concur - this was just the fd_putfile to match the fd_getfile in > the early error path for read-only files. The code now falls through > and calls fd_putfile regardless, to remove the fd reference. > > There should be no real behaviour change here apart from a relaxing > of the success conditions. > > -- > Paul Ripke > "Great minds discuss ideas, average minds discuss events, small minds > discuss people." > -- Disputed: Often attributed to Eleanor Roosevelt. 1948. -- Kind regards, Yorick Hardy
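The point Greg and Paul make about the reference count can be sketched in a few lines of C. This is only a toy model: `struct toy_file`, `toy_get`, `toy_put`, `sync_old` and `sync_new` are hypothetical names invented for illustration and are not the NetBSD `file_t`, `fd_getvnode` or `fd_putfile`. It shows that removing the early `EBADF` return does not leak or lose a reference, because the function still drops exactly one reference on every path.

```c
#include <errno.h>

/* Hypothetical stand-in for a kernel file handle (not the NetBSD file_t). */
struct toy_file {
    int refs;      /* outstanding references held on the handle */
    int writable;  /* non-zero if the handle was opened for writing */
};

/* Toy equivalents of taking and dropping a descriptor reference. */
static void toy_get(struct toy_file *fp) { fp->refs++; }
static void toy_put(struct toy_file *fp) { fp->refs--; }

/* Old shape: the early error path has to drop the reference itself. */
static int sync_old(struct toy_file *fp)
{
    toy_get(fp);
    if (!fp->writable) {
        toy_put(fp);        /* matches the get above */
        return EBADF;
    }
    /* ... the data sync would happen here ... */
    toy_put(fp);
    return 0;
}

/* New shape: no write check, one get and one put on the only path. */
static int sync_new(struct toy_file *fp)
{
    int error = 0;

    toy_get(fp);
    /* ... the data sync would happen here and could set error ... */
    toy_put(fp);
    return error;
}
```

In both versions the handle ends with the same reference count it started with; only the return value for a read-only handle changes.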
Re: PATCH: Relax fdatasync checks to IEEE Std 1003.1-2008
On 2020-05-24, Yorick Hardy wrote: > Dear Greg and Paul, > > On 2020-03-25, Paul Ripke wrote: > > On Mon, Mar 16, 2020 at 08:47:27AM -0400, Greg Troxel wrote: > > > [lots of test reports about fdatasync patch] > > > > > > Thanks -- that's enough for me to be comfortable. > > > and it's been proposed for more than long enough, with no adverse > > > comments, so I'll commit it soonish. > > > > fwiw, I missed a comment at the top of the function... fixed in > > attached patch. > > > > -- > > Paul Ripke > > "Great minds discuss ideas, average minds discuss events, small minds > > discuss people." > > -- Disputed: Often attributed to Eleanor Roosevelt. 1948. > > > diff --git a/lib/libc/sys/fdatasync.2 b/lib/libc/sys/fdatasync.2 > > index 3f12119f0dbb..20da609191f5 100644 > > --- a/lib/libc/sys/fdatasync.2 > > +++ b/lib/libc/sys/fdatasync.2 > > @@ -68,7 +68,7 @@ function will fail if: > > .It Bq Er EBADF > > The > > .Fa fd > > -argument is not a valid file descriptor open for writing. > > +argument is not a valid file descriptor. > > .It Bq Er EINVAL > > This implementation does not support synchronized I/O for this file. > > .It Bq Er ENOSYS > > @@ -93,4 +93,4 @@ and outstanding I/O operations are not guaranteed to have > > been completed. > > The > > .Fn fdatasync > > function conforms to > > -.St -p1003.1b-93 . > > +.St -p1003.1-2008 . > > diff --git a/sys/kern/vfs_syscalls.c b/sys/kern/vfs_syscalls.c > > index d51beedbfca9..8cfe5abe6cf8 100644 > > --- a/sys/kern/vfs_syscalls.c > > +++ b/sys/kern/vfs_syscalls.c > > @@ -4059,8 +4059,7 @@ sys_fsync(struct lwp *l, const struct sys_fsync_args > > *uap, register_t *retval) > > * Sync a range of file data. API modeled after that found in AIX. > > * > > * FDATASYNC indicates that we need only save enough metadata to be able > > - * to re-read the written data. Note we duplicate AIX's requirement that > > - * the file be open for writing. > > + * to re-read the written data. 
> > */ > > /* ARGSUSED */ > > int > > @@ -4141,10 +4140,6 @@ sys_fdatasync(struct lwp *l, const struct > > sys_fdatasync_args *uap, register_t *r > > /* fd_getvnode() will use the descriptor for us */ > > if ((error = fd_getvnode(SCARG(uap, fd), &fp)) != 0) > > return (error); > > - if ((fp->f_flag & FWRITE) == 0) { > > - fd_putfile(SCARG(uap, fd)); > > - return (EBADF); > > - } > > vp = fp->f_vnode; > > vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); > > error = VOP_FSYNC(vp, fp->f_cred, FSYNC_WAIT|FSYNC_DATAONLY, 0, 0); > > On 2020-03-25, Greg Troxel wrote: > > Paul Ripke writes: > > > > > On Mon, Mar 16, 2020 at 08:47:27AM -0400, Greg Troxel wrote: > > >> [lots of test reports about fdatasync patch] > > >> > > >> Thanks -- that's enough for me to be comfortable. > > >> and it's been proposed for more than long enough, with no adverse > > >> comments, so I'll commit it soonish. > > > > > fwiw, I missed a comment at the top of the function... fixed in > > > attached patch. > > > > I have committed your patch, exactly as you just sent it. My full > > release build worked and I have an anita test run in process, just in > > case. > > > > Thanks for persevering on this. It takes many people to fix all the > > loose ends in an operating system! > > I have been trying to find the cause of PR kern/55237. I am not at all > familiar with the code, so please forgive me for pointing fingers! > > I think the call to fd_putfile results in a close of the fd, but that > does not happen anymore? Should it? > > Apologies again if this has nothing to do with kern/55237. It was not the cause of kern/55237, apologies! -- Kind regards, Yorick Hardy
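For anyone who wants to check the relaxed behaviour from userland, a small probe along the following lines might help. It is a sketch under assumptions: the helper name `fdatasync_readonly` is made up here, and the result depends on the kernel being tested. With the check removed one would expect 0 for a regular file; a kernel that still insists on a descriptor open for writing would be expected to report `EBADF`.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Open `path` read-only and call fdatasync() on the descriptor.
 * Returns 0 if fdatasync() succeeded, the errno value if it failed,
 * or -1 if the file could not be opened at all.
 */
static int fdatasync_readonly(const char *path)
{
    int fd = open(path, O_RDONLY);
    int result;

    if (fd == -1)
        return -1;

    /* Capture errno before close() has a chance to overwrite it. */
    result = (fdatasync(fd) == 0) ? 0 : errno;
    close(fd);
    return result;
}
```

Calling it on any existing regular file and printing the result with strerror() is enough to see which behaviour a given kernel has.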
Re: PATCH: Relax fdatasync checks to IEEE Std 1003.1-2008
Dear Greg and Paul, On 2020-03-25, Paul Ripke wrote: > On Mon, Mar 16, 2020 at 08:47:27AM -0400, Greg Troxel wrote: > > [lots of test reports about fdatasync patch] > > > > Thanks -- that's enough for me to be comfortable. > > and it's been proposed for more than long enough, with no adverse > > comments, so I'll commit it soonish. > > fwiw, I missed a comment at the top of the function... fixed in > attached patch. > > -- > Paul Ripke > "Great minds discuss ideas, average minds discuss events, small minds > discuss people." > -- Disputed: Often attributed to Eleanor Roosevelt. 1948. > diff --git a/lib/libc/sys/fdatasync.2 b/lib/libc/sys/fdatasync.2 > index 3f12119f0dbb..20da609191f5 100644 > --- a/lib/libc/sys/fdatasync.2 > +++ b/lib/libc/sys/fdatasync.2 > @@ -68,7 +68,7 @@ function will fail if: > .It Bq Er EBADF > The > .Fa fd > -argument is not a valid file descriptor open for writing. > +argument is not a valid file descriptor. > .It Bq Er EINVAL > This implementation does not support synchronized I/O for this file. > .It Bq Er ENOSYS > @@ -93,4 +93,4 @@ and outstanding I/O operations are not guaranteed to have > been completed. > The > .Fn fdatasync > function conforms to > -.St -p1003.1b-93 . > +.St -p1003.1-2008 . > diff --git a/sys/kern/vfs_syscalls.c b/sys/kern/vfs_syscalls.c > index d51beedbfca9..8cfe5abe6cf8 100644 > --- a/sys/kern/vfs_syscalls.c > +++ b/sys/kern/vfs_syscalls.c > @@ -4059,8 +4059,7 @@ sys_fsync(struct lwp *l, const struct sys_fsync_args > *uap, register_t *retval) > * Sync a range of file data. API modeled after that found in AIX. > * > * FDATASYNC indicates that we need only save enough metadata to be able > - * to re-read the written data. Note we duplicate AIX's requirement that > - * the file be open for writing. > + * to re-read the written data. 
> */ > /* ARGSUSED */ > int > @@ -4141,10 +4140,6 @@ sys_fdatasync(struct lwp *l, const struct > sys_fdatasync_args *uap, register_t *r > /* fd_getvnode() will use the descriptor for us */ > if ((error = fd_getvnode(SCARG(uap, fd), &fp)) != 0) > return (error); > - if ((fp->f_flag & FWRITE) == 0) { > - fd_putfile(SCARG(uap, fd)); > - return (EBADF); > - } > vp = fp->f_vnode; > vn_lock(vp, LK_EXCLUSIVE | LK_RETRY); > error = VOP_FSYNC(vp, fp->f_cred, FSYNC_WAIT|FSYNC_DATAONLY, 0, 0); On 2020-03-25, Greg Troxel wrote: > Paul Ripke writes: > > > On Mon, Mar 16, 2020 at 08:47:27AM -0400, Greg Troxel wrote: > >> [lots of test reports about fdatasync patch] > >> > >> Thanks -- that's enough for me to be comfortable. > >> and it's been proposed for more than long enough, with no adverse > >> comments, so I'll commit it soonish. > > > fwiw, I missed a comment at the top of the function... fixed in > > attached patch. > > I have committed your patch, exactly as you just sent it. My full > release build worked and I have an anita test run in process, just in > case. > > Thanks for persevering on this. It takes many people to fix all the > loose ends in an operating system! I have been trying to find the cause of PR kern/55237. I am not at all familiar with the code, so please forgive me for pointing fingers! I think the call to fd_putfile results in a close of the fd, but that does not happen anymore? Should it? Apologies again if this has nothing to do with kern/55237. -- Kind regards, Yorick Hardy
Re: Panic: vrelel: bad ref count (9.99.54)
Dear Andrew, On 2020-04-19, Andrew Doran wrote: > Hi Yorick. > > On Sat, Apr 18, 2020 at 11:00:02AM +0200, Yorick Hardy wrote: > > > > I just had the same panic with 9.99.55: > > > > > > Crash version 9.99.55, image version 9.99.55. > > > crash: _kvm_kvatop(0) > > > Kernel compiled without options LOCKDEBUG. > > > System panicked: vrelel: bad ref count > > > Backtrace from time of crash is available. > > > crash> bt > > > _KERNEL_OPT_NAGR() at 0 > > > ?() at 7f7ff7ecf000 > > > sys_reboot() at sys_reboot > > > vpanic() at vpanic+0x181 > > > vtryrele() at vtryrele > > > vcache_dealloc() at vcache_dealloc > > > uvm_unmap_detach() at uvm_unmap_detach+0x76 > > > uvm_unmap1() at uvm_unmap1+0x4e > > > uvm_mremap() at uvm_mremap+0x36b > > > sys_mremap() at sys_mremap+0x68 > > > syscall() at syscall+0x227 > > > --- syscall (number 411) --- > > > 797459842e9a: > > > crash> > > > > The same just happened on 9.99.56 while fetching (POP) mail using mail/fdm. > > Could you file a PR please? If this panics again could you please run the > "dmesg" command in crash and find out what it printed about the vnode? That > would be very useful. > > Thanks, > Andrew I will do so (... perhaps only this weekend). -- Kind regards, Yorick Hardy
Re: Panic: vrelel: bad ref count (9.99.54)
Dear Andrew, On 2020-04-16, Yorick Hardy wrote: > Dear Andrew, > > On 2020-04-08, Yorick Hardy wrote: > > Dear Andrew, > > > > On 2020-04-07, Yorick Hardy wrote: > > > Dear Andrew, > > > > > > On 2020-04-07, Andrew Doran wrote: > > > > Hi Yorick. > > > > > > > > On Mon, Apr 06, 2020 at 11:16:37PM +0200, Yorick Hardy wrote: > > > > > > > > >Crash version 9.99.54, image version 9.99.54. > > > > >crash: _kvm_kvatop(0) > > > > >Kernel compiled without options LOCKDEBUG. > > > > >System panicked: vrelel: bad ref count > > > > >Backtrace from time of crash is available. > > > > >crash> bt > > > > >_KERNEL_OPT_NAGR() at 0 > > > > >?() at 7f7ff7ecf000 > > > > >sys_reboot() at sys_reboot > > > > >vpanic() at vpanic+0x181 > > > > >vtryrele() at vtryrele > > > > >vcache_dealloc() at vcache_dealloc > > > > >uvm_unmap_detach() at uvm_unmap_detach+0x76 > > > > >uvm_unmap1() at uvm_unmap1+0x4e > > > > >uvm_mremap() at uvm_mremap+0x36b > > > > >sys_mremap() at sys_mremap+0x68 > > > > >syscall() at syscall+0x227 > > > > >--- syscall (number 411) --- > > > > >7f0af7842e9a: > > > > > > > > Were you running anything noteworthy at the time? There is a very good > > > > chance that is fixed by revision 1.217 of src/sys/kern/vfs_lookup.c. > > > > > > > > Thanks, > > > > Andrew > > > > > > Thanks! It happens often when running wip/urlwatch, which keeps > > > the disk quite busy. I will try a new kernel as soon as I can free > > > up my computer! > > > > Initial tests are unable to trigger the panic, so it looks promising! > > Let's assume it is fixed, I will continue to use this kernel and report > > back if necessary. > > > > Thanks again! > > I just had the same panic with 9.99.55: > > Crash version 9.99.55, image version 9.99.55. > crash: _kvm_kvatop(0) > Kernel compiled without options LOCKDEBUG. > System panicked: vrelel: bad ref count > Backtrace from time of crash is available. 
> crash> bt > _KERNEL_OPT_NAGR() at 0 > ?() at 7f7ff7ecf000 > sys_reboot() at sys_reboot > vpanic() at vpanic+0x181 > vtryrele() at vtryrele > vcache_dealloc() at vcache_dealloc > uvm_unmap_detach() at uvm_unmap_detach+0x76 > uvm_unmap1() at uvm_unmap1+0x4e > uvm_mremap() at uvm_mremap+0x36b > sys_mremap() at sys_mremap+0x68 > syscall() at syscall+0x227 > --- syscall (number 411) --- > 797459842e9a: > crash> The same just happened on 9.99.56 while fetching (POP) mail using mail/fdm. I have 9.99.42 running without issues, I have not had the time to bisect further. -- Kind regards, Yorick Hardy
Re: Panic: vrelel: bad ref count (9.99.54)
Dear Andrew, On 2020-04-08, Yorick Hardy wrote: > Dear Andrew, > > On 2020-04-07, Yorick Hardy wrote: > > Dear Andrew, > > > > On 2020-04-07, Andrew Doran wrote: > > > Hi Yorick. > > > > > > On Mon, Apr 06, 2020 at 11:16:37PM +0200, Yorick Hardy wrote: > > > > > > >Crash version 9.99.54, image version 9.99.54. > > > >crash: _kvm_kvatop(0) > > > >Kernel compiled without options LOCKDEBUG. > > > >System panicked: vrelel: bad ref count > > > >Backtrace from time of crash is available. > > > >crash> bt > > > >_KERNEL_OPT_NAGR() at 0 > > > >?() at 7f7ff7ecf000 > > > >sys_reboot() at sys_reboot > > > >vpanic() at vpanic+0x181 > > > >vtryrele() at vtryrele > > > >vcache_dealloc() at vcache_dealloc > > > >uvm_unmap_detach() at uvm_unmap_detach+0x76 > > > >uvm_unmap1() at uvm_unmap1+0x4e > > > >uvm_mremap() at uvm_mremap+0x36b > > > >sys_mremap() at sys_mremap+0x68 > > > >syscall() at syscall+0x227 > > > >--- syscall (number 411) --- > > > >7f0af7842e9a: > > > > > > Were you running anything noteworthy at the time? There is a very good > > > chance that is fixed by revision 1.217 of src/sys/kern/vfs_lookup.c. > > > > > > Thanks, > > > Andrew > > > > Thanks! It happens often when running wip/urlwatch, which keeps > > the disk quite busy. I will try a new kernel as soon as I can free > > up my computer! > > Initial tests are unable to trigger the panic, so it looks promising! > Let's assume it is fixed, I will continue to use this kernel and report > back if necessary. > > Thanks again! I just had the same panic with 9.99.55: Crash version 9.99.55, image version 9.99.55. crash: _kvm_kvatop(0) Kernel compiled without options LOCKDEBUG. System panicked: vrelel: bad ref count Backtrace from time of crash is available. 
crash> bt _KERNEL_OPT_NAGR() at 0 ?() at 7f7ff7ecf000 sys_reboot() at sys_reboot vpanic() at vpanic+0x181 vtryrele() at vtryrele vcache_dealloc() at vcache_dealloc uvm_unmap_detach() at uvm_unmap_detach+0x76 uvm_unmap1() at uvm_unmap1+0x4e uvm_mremap() at uvm_mremap+0x36b sys_mremap() at sys_mremap+0x68 syscall() at syscall+0x227 --- syscall (number 411) --- 797459842e9a: crash> but I am not sure what caused it. I will update and try again. -- Kind regards, Yorick Hardy
Re: Panic: vrelel: bad ref count (9.99.54)
Dear Andrew, On 2020-04-07, Yorick Hardy wrote: > Dear Andrew, > > On 2020-04-07, Andrew Doran wrote: > > Hi Yorick. > > > > On Mon, Apr 06, 2020 at 11:16:37PM +0200, Yorick Hardy wrote: > > > > >Crash version 9.99.54, image version 9.99.54. > > >crash: _kvm_kvatop(0) > > >Kernel compiled without options LOCKDEBUG. > > >System panicked: vrelel: bad ref count > > >Backtrace from time of crash is available. > > >crash> bt > > >_KERNEL_OPT_NAGR() at 0 > > >?() at 7f7ff7ecf000 > > >sys_reboot() at sys_reboot > > >vpanic() at vpanic+0x181 > > >vtryrele() at vtryrele > > >vcache_dealloc() at vcache_dealloc > > >uvm_unmap_detach() at uvm_unmap_detach+0x76 > > >uvm_unmap1() at uvm_unmap1+0x4e > > >uvm_mremap() at uvm_mremap+0x36b > > >sys_mremap() at sys_mremap+0x68 > > >syscall() at syscall+0x227 > > >--- syscall (number 411) --- > > >7f0af7842e9a: > > > > Were you running anything noteworthy at the time? There is a very good > > chance that is fixed by revision 1.217 of src/sys/kern/vfs_lookup.c. > > > > Thanks, > > Andrew > > Thanks! It happens often when running wip/urlwatch, which keeps > the disk quite busy. I will try a new kernel as soon as I can free > up my computer! Initial tests are unable to trigger the panic, so it looks promising! Let's assume it is fixed, I will continue to use this kernel and report back if necessary. Thanks again! -- Kind regards, Yorick Hardy
Re: Panic: vrelel: bad ref count (9.99.54)
Dear Andrew, On 2020-04-07, Andrew Doran wrote: > Hi Yorick. > > On Mon, Apr 06, 2020 at 11:16:37PM +0200, Yorick Hardy wrote: > > >Crash version 9.99.54, image version 9.99.54. > >crash: _kvm_kvatop(0) > >Kernel compiled without options LOCKDEBUG. > >System panicked: vrelel: bad ref count > >Backtrace from time of crash is available. > >crash> bt > >_KERNEL_OPT_NAGR() at 0 > >?() at 7f7ff7ecf000 > >sys_reboot() at sys_reboot > >vpanic() at vpanic+0x181 > >vtryrele() at vtryrele > >vcache_dealloc() at vcache_dealloc > >uvm_unmap_detach() at uvm_unmap_detach+0x76 > >uvm_unmap1() at uvm_unmap1+0x4e > >uvm_mremap() at uvm_mremap+0x36b > >sys_mremap() at sys_mremap+0x68 > >syscall() at syscall+0x227 > >--- syscall (number 411) --- > >7f0af7842e9a: > > Were you running anything noteworthy at the time? There is a very good > chance that is fixed by revision 1.217 of src/sys/kern/vfs_lookup.c. > > Thanks, > Andrew Thanks! It happens often when running wip/urlwatch, which keeps the disk quite busy. I will try a new kernel as soon as I can free up my computer! -- Kind regards, Yorick Hardy
Panic: vrelel: bad ref count (9.99.54)
Dear current users,

Has anyone else seen this?

Crash version 9.99.54, image version 9.99.54.
crash: _kvm_kvatop(0)
Kernel compiled without options LOCKDEBUG.
System panicked: vrelel: bad ref count
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NAGR() at 0
?() at 7f7ff7ecf000
sys_reboot() at sys_reboot
vpanic() at vpanic+0x181
vtryrele() at vtryrele
vcache_dealloc() at vcache_dealloc
uvm_unmap_detach() at uvm_unmap_detach+0x76
uvm_unmap1() at uvm_unmap1+0x4e
uvm_mremap() at uvm_mremap+0x36b
sys_mremap() at sys_mremap+0x68
syscall() at syscall+0x227
--- syscall (number 411) ---
7f0af7842e9a:
crash>

-- 
Kind regards,

Yorick Hardy
Re: Audio recording (using ossaudio)
Dear Tetsuya, On 2020-03-20, Tetsuya Isaki wrote: > At Fri, 20 Mar 2020 08:08:37 +0200, > Yorick Hardy wrote: > > It seems to be stuck in select (or poll, I did not check the source) > > in portaudio. > > Yeah, I'm just looking this in this week. > poll()/select() before read() doesn't work correctly now. > I will fix it. > > > Updating audio/portaudio from portaudio-190600.20161030nb1 to > > portaudio-190600.20161030nb2 > > fixes the problem (maybe because of the patch to disable non-blocking I/O > > ?). > > I looked it right now. And it looks bad strategy. > He should have reported it first... > > > Now 44100 MHz does not sound right (I will send the example off-list), but > > 48000 MHz is fine (this is the same behaviour as audiorecord). > > I heard it but unfortunately I don't know expected status. > It sounded like analog noise or environmental noise. > > > Anyway, if you want to record with pure 44100Hz, you need to > set the hardware 44100Hz mode using audiocfg(1) command: > # audiocfg set r slinear_le 16 2 44100 > > On NetBSD7 (or prior), if you record 44100Hz, the kernel set > the hardware 44100Hz, because it was single audio system. > > On NetBSD9 (or later), multiple recorder apps can be run > simultaneously. So even if your single app want to record > 44100Hz, the kernel can not change the hardware frequency. > The kernel converts from the hardware frequency to your > requested frequency (if different). > In-kernel frequency conversion is simple (and fast and small) > than what userland rich apps does (and I personally think that > such rich operation should be done by userland). > > You need to a)change the hardware format (by audiocfg) or > b)record as the hardware format ("audiocfg list" displays) > and convert it by userland rich converter. > > Thanks, > --- > Tetsuya Isaki Thanks! I think Nia mentioned this also, but somehow I did not fully understand the role of audiocfg. -- Kind regards, Yorick Hardy
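To make the idea of a "simple" frequency conversion concrete, here is a minimal sketch of a linear-interpolation resampler for mono 16-bit samples. It is explicitly not the NetBSD kernel's converter, whose algorithm is not described in this thread; the function name `resample_linear` and its parameters are assumptions chosen for illustration. It only shows the kind of small, cheap conversion that has to happen somewhere when the hardware runs at one rate (say 48000 Hz) and the application asks for another (say 44100 Hz).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Resample mono 16-bit audio from in_rate to out_rate by linear
 * interpolation between neighbouring input samples.
 * Writes at most out_cap samples to out and returns how many were written.
 * Stops as soon as an output sample would need input beyond the buffer.
 */
static size_t resample_linear(const int16_t *in, size_t in_len, int in_rate,
                              int16_t *out, size_t out_cap, int out_rate)
{
    size_t n;

    if (in_len < 2 || in_rate <= 0 || out_rate <= 0)
        return 0;

    for (n = 0; n < out_cap; n++) {
        /* Position of output sample n on the input axis: num / out_rate. */
        uint64_t num = (uint64_t)n * (uint64_t)in_rate;
        size_t i = (size_t)(num / (uint64_t)out_rate);
        int64_t frac = (int64_t)(num % (uint64_t)out_rate);
        int32_t a, b;

        if (i + 1 >= in_len)
            break;              /* both neighbours are needed */

        a = in[i];
        b = in[i + 1];
        out[n] = (int16_t)(a + (int64_t)(b - a) * frac / out_rate);
    }
    return n;
}
```

A userland converter would normally add filtering to avoid aliasing, which is the sort of "rich" processing Tetsuya suggests belongs outside the kernel.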
Re: Audio recording (using ossaudio)
(Oops: forgot to Cc the list.) Dear Tetsuya, On 2020-03-20, Tetsuya Isaki wrote: > At Thu, 19 Mar 2020 21:36:00 +0200, > Yorick Hardy wrote: > > > > ffmpeg4 -f oss -i /dev/audio -channels 1 -sample_rate 48000 > > > > /tmp/test.wav > > > > > > > > is completely garbled and too short. The file also seems to be > > > > 2-channel, > > > > so I think the recording settings are somehow not applied correctly. > > > > > > I rarely use ffmpeg4 but according to ffmpeg4 documents, > > > -channels/-sample_rate are for video and -ac/-ar are for audio? > > > > If I used it correctly, it is for "-f oss" so for the input. Maybe it > > should go before "-i", but if I recall correctly it does not make > > a difference. > > > > I think "-ac" is for the output format (ffmpeg performs appropriate > > conversion). > > I see, sorry for noise. No problem at all! > > > As you said, output file was too short. However ffmpeg4 probably > > > recorded specified period and created small file so that I think > > > you need to look ffmpeg4 at first. > > > > I did not figure out why the file is too short, but there is some > > oss/ffmpeg interaction (maybe due to non-blocking reads?) which causes > > this. > > It's hard to believe that non-blocking read affects. > I've implemented non-blocking i/o carefully too. If you find how > to reproduce the problem, please send PR. I will try again to reproduce the problem with a minimal program. It was not easy to reproduce the ffmpeg problem (I am not sure why, maybe it is a combined pts calculation and non-blocking read problem). > By the way, in ffmpeg-4.2.1/libavdevice/oss_dec.c: > 70 static int audio_read_packet(AVFormatContext *s1, AVPacket *pkt) > 71 { > : > 95 /* subtract time represented by the number of bytes in the audio fif > 96 cur_time -= (bdelay * 100LL) / (s->sample_rate * s->channels); > > I wonder why this calculation doesn't have precision (16bits = 2bytes). > I modified this line but it did not seem to improve. 
But I didn't > chase more. I am sure the ffmpeg calculation is wrong, but (as you say) changing it does not fix the problem. -- Kind regards, Yorick Hardy
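The arithmetic Tetsuya is questioning can be checked in isolation. The sketch below is not the ffmpeg code: the function names are invented here, and the microsecond scale factor of 1000000 is an assumption (the excerpt as archived shows a different constant). It compares the delay computed from a byte count when the sample width is ignored with the delay when the width (2 bytes for 16-bit audio) is included.

```c
#include <stdint.h>

/*
 * Microseconds of audio represented by `bytes` queued bytes, dividing only
 * by rate and channel count, as the quoted line does.
 */
static int64_t delay_us_no_width(int64_t bytes, int sample_rate, int channels)
{
    return (bytes * 1000000LL) / ((int64_t)sample_rate * channels);
}

/* The same calculation with the sample width in bytes taken into account. */
static int64_t delay_us(int64_t bytes, int sample_rate, int channels,
                        int bytes_per_sample)
{
    return (bytes * 1000000LL) /
        ((int64_t)sample_rate * channels * bytes_per_sample);
}
```

For 19200 queued bytes of 16-bit stereo audio at 48000 Hz, the first form gives 200000 microseconds and the second gives 100000, so ignoring the width overstates the buffered time by a factor of two. As noted in the thread, correcting this alone did not cure the short recordings.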
Re: Audio recording (using ossaudio)
Dear Tetsuya, On 2020-03-19, Tetsuya Isaki wrote: > At Sat, 14 Mar 2020 15:05:37 +0200, > Yorick Hardy wrote: > > Re: audacity (earlier in the thread), audacity hangs whenever I try to > > record. > > Would you tell how to reproduce it? First: my pkgsrc is not up to date -- so there could be some other reason for this. To reproduce: 1) install audio/audacity (I have audacity-2.3.3nb2) 2) start audacity and press the record button! and audacity becomes unresponsive (and also does not repaint when uncovered). > Thanks, > --- > Tetsuya Isaki -- Kind regards, Yorick Hardy
Re: Audio recording (using ossaudio)
Dear Tetsuya, On 2020-03-19, Tetsuya Isaki wrote: > At Tue, 10 Mar 2020 20:49:55 +0200, > Yorick Hardy wrote: > > ffmpeg4 -f oss -i /dev/audio -channels 1 -sample_rate 48000 /tmp/test.wav > > > > is completely garbled and too short. The file also seems to be 2-channel, > > so I think the recording settings are somehow not applied correctly. > > I rarely use ffmpeg4 but according to ffmpeg4 documents, > -channels/-sample_rate are for video and -ac/-ar are for audio? If I used it correctly, it is for "-f oss" so for the input. Maybe it should go before "-i", but if I recall correctly it does not make a difference. I think "-ac" is for the output format (ffmpeg performs appropriate conversion). > % /usr/bin/time ffmpeg4 -f oss -t 0:05 -i /dev/audio -channels 1 test1.wav > 5.04 real 0.02 user 0.04 sys > % file test1.wav > test1.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, > stereo 48000 Hz > > % /usr/bin/time ffmpeg4 -f oss -t 0:05 -i /dev/audio -ac 1 test2.wav > 5.04 real 0.04 user 0.02 sys > % file test2.wav > test2.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, > mono 48000 Hz > > % /usr/bin/time audioplay test1.wav > 2.54 real 0.00 user 0.00 sys > % /usr/bin/time audioplay test2.wav > 2.54 real 0.00 user 0.00 sys > > As you said, output file was too short. However ffmpeg4 probably > recorded specified period and created small file so that I think > you need to look ffmpeg4 at first. I did not figure out why the file is too short, but there is some oss/ffmpeg interaction (maybe due to non-blocking reads?) which causes this. I created wip/ffmpeg4-nbsdaudio and nia improved it, and now recording works fine: ffmpeg4 -f nbsdaudio -i /dev/audio /tmp/netbsd.wav records as expected. > Thanks, > --- > Tetsuya Isaki Thank you! -- Kind regards, Yorick Hardy
Re: Audio recording (using ossaudio)
Dear nia, On 2020-03-14, nia wrote: > On Sat, Mar 14, 2020 at 12:20:11AM +0200, Yorick Hardy wrote: > > You are correct. I threw together a NetBSD audio driver based on the oss > > driver, but it had exactly the same problem. Strangely, I have been unable > > to > > reproduce the problem on an old i386 netbook (so far). > > > > I wrote a test program to try and reproduce what ffmpeg is doing, and > > (I am not sure yet) it seems like non-blocking reads is causing the > > distortion. The same test program with blocking reads seems to work > > okay. > > > > I will look into it a bit more, and then report back. > > Right, /dev/audio doesn't support non-blocking I/O. But you're supposed to > do short enough reads and writes that it shouldn't matter. That might be > the cause of the worst of the problems. Oops, I think the man page might need to be updated then. I managed to convince my test program to correctly record with non-blocking I/O (perhaps by accident?) by working a bit differently to ffmpeg, but I am not sure how to adjust ffmpeg in this way. I will try blocking reads (presumably reading blocksize bytes at a time). Re: audacity (earlier in the thread), audacity hangs whenever I try to record. I probably need to update all of my packages - but I am not doing any long builds at the moment due to unpredictable electricity supply! > Do you want to work on this together somewhere? Yes, that would be great! As long as you don't mind someone who is extremely unresponsive most of the week! I have quite a few deadlines in the next week, and will probably ignore most things while I am doing that work. Attached are the patches for my "testing" version of the ffmpeg backend (heavily based on the OSS backend). I am sure it should be renamed to "netbsd", initially I was trying for sun compatibility - but I am not sure that makes sense. 
-- Kind regards, Yorick Hardy $NetBSD$ --- configure.orig 2019-08-05 21:11:40.0 + +++ configure @@ -2115,6 +2115,7 @@ HEADERS_LIST=" opencv2_core_core_c_h OpenGL_gl3_h poll_h +sys_audioio_h sys_param_h sys_resource_h sys_select_h @@ -3306,6 +3307,8 @@ android_camera_indev_deps="android camer android_camera_indev_extralibs="-landroid -lcamera2ndk -lmediandk" alsa_indev_deps="alsa" alsa_outdev_deps="alsa" +audioio_indev_deps_any="sys_audioio_h" +audioio_outdev_deps_any="sys_audioio_h" avfoundation_indev_deps="avfoundation corevideo coremedia pthreads" avfoundation_indev_suggest="coregraphics applicationservices" avfoundation_indev_extralibs="-framework Foundation" @@ -6461,6 +6464,10 @@ check_headers "dev/bktr/ioctl_meteor.h d check_headers "dev/video/meteor/ioctl_meteor.h dev/video/bktr/ioctl_bt848.h" || check_headers "dev/ic/bt8xx.h" +if check_struct sys/audioio.h audio_info_t play; then +enable_sanitized sys/audioio.h +fi + if check_struct sys/soundcard.h audio_buf_info bytes; then enable_sanitized sys/soundcard.h else patch-doc_indevs.texi Description: TeXInfo document patch-doc_outdevs.texi Description: TeXInfo document $NetBSD$ --- libavdevice/Makefile.orig 2019-08-05 20:52:21.0 + +++ libavdevice/Makefile @@ -15,6 +15,8 @@ OBJS-$(CONFIG_SHARED) OBJS-$(CONFIG_ALSA_INDEV)+= alsa_dec.o alsa.o timefilter.o OBJS-$(CONFIG_ALSA_OUTDEV) += alsa_enc.o alsa.o OBJS-$(CONFIG_ANDROID_CAMERA_INDEV) += android_camera.o +OBJS-$(CONFIG_AUDIOIO_INDEV) += audioio_dec.o audioio.o +OBJS-$(CONFIG_AUDIOIO_OUTDEV)+= audioio_enc.o audioio.o OBJS-$(CONFIG_AVFOUNDATION_INDEV)+= avfoundation.o OBJS-$(CONFIG_BKTR_INDEV)+= bktr.o OBJS-$(CONFIG_CACA_OUTDEV) += caca.o $NetBSD$ --- libavdevice/alldevices.c.orig 2019-08-05 20:52:21.0 + +++ libavdevice/alldevices.c @@ -27,6 +27,8 @@ extern AVInputFormat ff_alsa_demuxer; extern AVOutputFormat ff_alsa_muxer; extern AVInputFormat ff_android_camera_demuxer; +extern AVInputFormat ff_audioio_demuxer; +extern AVOutputFormat ff_audioio_muxer; extern 
AVInputFormat ff_avfoundation_demuxer; extern AVInputFormat ff_bktr_demuxer; extern AVOutputFormat ff_caca_muxer; --- /dev/null 2020-03-11 11:29:21.727010528 +0200 +++ libavdevice/audioio.c 2020-03-11 11:41:43.710391401 +0200 @@ -0,0 +1,125 @@ +/* + * Sun and NetBSD play and grab interface + * Copyright (c) 2020 Yorick Hardy + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * FFmpeg is
Re: current: completely stuck after four minutes of uptime
On 2020-03-15, Chavdar Ivanov wrote: > Hi, > > On Sun, 15 Mar 2020 at 11:07, Chavdar Ivanov wrote: > > > > On Sun, 15 Mar 2020 at 10:29, Thomas Klausner wrote: > > > > > > Hi! > > > > > > I've just upgraded my 9.99.49 kernel from March 12 to today's from an > > > hour ago. > > > > > > After rebooting, the machine got stuck in less than five minutes. > > > > > > No reaction to CTRL-ALT-ESC from the console, no reaction to pressing > > > the power button. > > > > Mine is from > > > > NetBSD 9.99.49 (GENERIC) #1: Sun Mar 15 02:33:56 GMT 2020 > > > > and works just fine; upgraded three machines without any problem. > > I was somewhat quick to conclude that. One of the upgraded machines, a > VirtualBox guest running GENERIC_KASLR, after perhaps 3 hours appeared > stuck and unresponsive; I couldn't get into the debugger either. The > other physical box is still working, though (and rebuilding). > > > > > > > > > Thomas Maybe completely unrelated, but a kernel compiled yesterday hangs while booting for me. After pressing the power button, the debugger stops in "usb_disconnect_port". -- Kind regards, Yorick Hardy
Re: change within last day broke nvmm
Dear Tobias,

On 2020-03-15, Tobias Nygren wrote:
> Hi,
>
> This is consistently reproducible while trying to boot Linux on nvmm.
>
> panic: LIST_INSERT_HEAD 0x88713368 x86/pmap.c:2135
> vpanic()
> panic()
> pmap_enter_pv()
> pmap_ept_enter()
> uvm_fault_lower_enter()
> uvm_fault_internal()
> nvmm_ioctl()
> sys_ioctl()
> syscall()
>
> -Tobias

I think Maxime would like to be Cc'd on NVMM issues, so I am doing that here.

-- 
Kind regards,

Yorick Hardy
Re: Audio recording (using ossaudio)
Dear nia, On 2020-03-14, Yorick Hardy wrote: > On 2020-03-14, Yorick Hardy wrote: > > Dear nia, > > > > On 2020-03-14, nia wrote: > > > On Sat, Mar 14, 2020 at 12:20:11AM +0200, Yorick Hardy wrote: > > > > You are correct. I threw together a NetBSD audio driver based on the oss > > > > driver, but it had exactly the same problem. Strangely, I have been > > > > unable to > > > > reproduce the problem on an old i386 netbook (so far). > > > > > > > > I wrote a test program to try and reproduce what ffmpeg is doing, and > > > > (I am not sure yet) it seems like non-blocking reads is causing the > > > > distortion. The same test program with blocking reads seems to work > > > > okay. > > > > > > > > I will look into it a bit more, and then report back. > > > > > > Right, /dev/audio doesn't support non-blocking I/O. But you're supposed to > > > do short enough reads and writes that it shouldn't matter. That might be > > > the cause of the worst of the problems. > > > > Oops, I think the man page might need to be updated then. > > I managed to convince my test program to correctly record > > with non-blocking I/O (perhaps by accident?) by working > > a bit differently to ffmpeg, but I am not sure how to > > adjust ffmpeg in this way. I will try blocking reads > > (presumably reading blocksize bytes at a time). > > > > Re: audacity (earlier in the thread), audacity hangs whenever I try to > > record. I probably need to update all of my packages - but I am not > > doing any long builds at the moment due to unpredictable electricity > > supply! > > > > > Do you want to work on this together somewhere? > > > > Yes, that would be great! As long as you don't mind someone who is > > extremely unresponsive most of the week! I have quite a few deadlines > > in the next week, and will probably ignore most things while I am > > doing that work. > > > > Attached are the patches for my "testing" version of the ffmpeg > > backend (heavily based on the OSS backend). 
I am sure it should be > > renamed to "netbsd", initially I was trying for sun compatibility - > > but I am not sure that makes sense. > > Just a side note that I keep forgetting to mention: I used ffmpeg > and oss to record videos over the last few years and "it used to > work fine". I think that the most recent audio changes have > broken some expectations that ffmpeg has, but it used to work > (more or less) as ffmpeg expected. > > That said, I think a netbsd audio backend would be great. I will > create a pkgsrc-wip package in the mean time to start working on > a netbsd audio backend, unless someone beats me to it! > > [wip/ags also has some audio problems, it uses allegro for audio; > but I will start another thread about that one day.] I have imported wip/ffmpeg4-nbsdaudio which, thanks to your comments, now manages some simple recording and playback. I am sure many improvements still need to be made, but I am happy that I can record videos again if I need to! -- Kind regards, Yorick Hardy
Re: Audio recording (using ossaudio)
On 2020-03-14, Yorick Hardy wrote: > Dear nia, > > On 2020-03-14, nia wrote: > > On Sat, Mar 14, 2020 at 12:20:11AM +0200, Yorick Hardy wrote: > > > You are correct. I threw together a NetBSD audio driver based on the oss > > > driver, but it had exactly the same problem. Strangely, I have been > > > unable to > > > reproduce the problem on an old i386 netbook (so far). > > > > > > I wrote a test program to try and reproduce what ffmpeg is doing, and > > > (I am not sure yet) it seems like non-blocking reads is causing the > > > distortion. The same test program with blocking reads seems to work > > > okay. > > > > > > I will look into it a bit more, and then report back. > > > > Right, /dev/audio doesn't support non-blocking I/O. But you're supposed to > > do short enough reads and writes that it shouldn't matter. That might be > > the cause of the worst of the problems. > > Oops, I think the man page might need to be updated then. > I managed to convince my test program to correctly record > with non-blocking I/O (perhaps by accident?) by working > a bit differently to ffmpeg, but I am not sure how to > adjust ffmpeg in this way. I will try blocking reads > (presumably reading blocksize bytes at a time). > > Re: audacity (earlier in the thread), audacity hangs whenever I try to > record. I probably need to update all of my packages - but I am not > doing any long builds at the moment due to unpredictable electricity > supply! > > > Do you want to work on this together somewhere? > > Yes, that would be great! As long as you don't mind someone who is > extremely unresponsive most of the week! I have quite a few deadlines > in the next week, and will probably ignore most things while I am > doing that work. > > Attached are the patches for my "testing" version of the ffmpeg > backend (heavily based on the OSS backend). I am sure it should be > renamed to "netbsd", initially I was trying for sun compatibility - > but I am not sure that makes sense. 
Just a side note that I keep forgetting to mention: I used ffmpeg and oss to record videos over the last few years and "it used to work fine". I think that the most recent audio changes have broken some expectations that ffmpeg has, but it used to work (more or less) as ffmpeg expected. That said, I think a netbsd audio backend would be great. I will create a pkgsrc-wip package in the mean time to start working on a netbsd audio backend, unless someone beats me to it! [wip/ags also has some audio problems, it uses allegro for audio; but I will start another thread about that one day.] -- Kind regards, Yorick Hardy
Re: Audio recording (using ossaudio)
Dear nia, On 2020-03-13, nia wrote: > On Tue, Mar 10, 2020 at 08:49:55PM +0200, Yorick Hardy wrote: > > Can anyone else record audio correctly via ossaudio? > > audiorecord seems to work as long as the frequency > > divides the native frequency (see dmesg excerpt below) > > (I missed this post, but got contacted about it directly off-list. I'm > probably a good person to contact about this sort of thing). Thanks, I watched your recent talk :-) > The sample rate and number of channels need to exactly match the device > for ideal output. Right, but I think it should not be a problem if it is different (subject to some quality loss) because of the new audio system? > > The ffmpeg OSS code looks very primitive. I might be persuaded to write > a backend that does detection of device characteristics. It's basically > required for proper audio recording on NetBSD. Note that our OSS emulation > doesn't match the spec exactly, and is also undocumented, so doing anything > non-trivial is hard. I don't recommend writing new code that uses it for > that reason. You are correct. I threw together a NetBSD audio driver based on the oss driver, but it had exactly the same problem. Strangely, I have been unable to reproduce the problem on an old i386 netbook (so far). I wrote a test program to try and reproduce what ffmpeg is doing, and (I am not sure yet) it seems like non-blocking reads is causing the distortion. The same test program with blocking reads seems to work okay. I will look into it a bit more, and then report back. > Out of curiosity, does Audacity work for you (when set to single channel > 16-bit PCM, etc - it defaults to 32-bit floats which won't work). I am building it ... my computer is quite old! I will report back once I have tried it out. -- Kind regards, Yorick Hardy
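The blocking-read approach mentioned above (reading a fixed number of bytes at a time instead of polling with non-blocking reads) could look roughly like the following. This is only a sketch under assumptions: the helper name read_full is hypothetical, it is not taken from ffmpeg or from the test program, and it relies on nothing beyond POSIX read(2) on a descriptor opened without O_NONBLOCK.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly n bytes from a blocking descriptor,
 * retrying after short reads and EINTR.  Returns the number of bytes
 * read (less than n only at end of file), or -1 on error.
 */
ssize_t read_full(int fd, void *buf, size_t n)
{
    unsigned char *p = buf;
    size_t done = 0;

    while (done < n) {
        ssize_t r = read(fd, p + done, n - done);
        if (r == 0)
            break;              /* end of file */
        if (r < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, try again */
            return -1;
        }
        done += (size_t)r;
    }
    return (ssize_t)done;
}
```

A recording loop would then call read_full(fd, block, blocksize) once per block, where blocksize is whatever the device reports; how that value is obtained is left out here.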
Audio recording (using ossaudio)
Dear current-users, Can anyone else record audio correctly via ossaudio? audiorecord seems to work as long as the frequency divides the native frequency (see dmesg excerpt below) audiorecord -d /dev/audio -c 1 -s 48000 -e slinear_le -P 16 /tmp/test.wav seems to work fine, and audiorecord -d /dev/audio -c 1 -s 44100 -e slinear_le -P 16 /tmp/test.wav is a little noisy (maybe due to sample rate conversion?). But ffmpeg4 -f oss -i /dev/audio -channels 1 -sample_rate 48000 /tmp/test.wav is completely garbled and too short. The file also seems to be 2-channel, so I think the recording settings are somehow not applied correctly. On the other hand, gst-launch-1.0 osssrc \! audio/x-raw,channels=1,format=S16LE,rate=48000 \! wavenc \! filesink location=/tmp/test.wav works fine (I am not sure what it does differently). Apologies for not finding the cause of the issue, I have not had time to investigate. I need ffmpeg recording to work in case I have to make videos for my classes :-) -- Kind regards, Yorick Hardy
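To make the "frequency divides the native frequency" observation above concrete, here is a small sketch that reduces the native:requested ratio to lowest terms. The function names are hypothetical and this is not how the kernel mixer computes anything; it only illustrates why 44100 Hz needs a fractional conversion from a 48000 Hz device while 8000 Hz does not.

```c
/* Greatest common divisor by Euclid's algorithm. */
unsigned int gcd_u(unsigned int a, unsigned int b)
{
    while (b != 0) {
        unsigned int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Returns 1 when the requested rate divides the native rate exactly. */
int rate_divides_native(unsigned int native, unsigned int requested)
{
    return requested != 0 && native % requested == 0;
}

/*
 * Reduce native:requested to lowest terms (num:den).
 * Both outputs are set to 0 if both rates are 0.
 */
void rate_ratio(unsigned int native, unsigned int requested,
                unsigned int *num, unsigned int *den)
{
    unsigned int g = gcd_u(native, requested);

    if (g == 0) {
        *num = 0;
        *den = 0;
        return;
    }
    *num = native / g;
    *den = requested / g;
}
```

For a 48000 Hz device, 44100 Hz reduces to 160:147 (no integer decimation), whereas 8000 Hz reduces to 6:1.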
Re: Weird qemu-nvmm problem
On 2020-03-09, Chavdar Ivanov wrote: > On Mon, 9 Mar 2020 at 19:52, Yorick Hardy wrote: > > > > Dear Chavdar, > > > > On 2020-03-08, Chavdar Ivanov wrote: > > > Hi, > > > > > > On a -current (from today, but has happened before), when running a > > > particular nvmm guest (32-bit Windows 10), usually when it is busy > > > going through some updates, the host gets into a weird state. I have > > > an ssh connection to it with several tmux panes open; I can switch > > > between them, so the connection to the host is still ok, but in the > > > same time the host does not answer to pings anymore; none of the tmux > > > panes themselves accepts any input, with the exception of the one I am > > > running the qemu client in; I can interrupt it and after that the host > > > comes into normal state. When this happens, I get > > > > > > [ 7444.602404] coretemp3: workqueue busy: updates stopped > > > [ 7474.614306] coretemp0: workqueue busy: updates stopped > > > [ 7474.614306] coretemp1: workqueue busy: updates stopped > > > [ 7474.614306] coretemp2: workqueue busy: updates stopped > > > [ 23591.005414] acpitz4: workqueue busy: updates stopped > > > [ 23591.005414] acpibat1: workqueue busy: updates stopped > > > > > > The machine is not doing anything else at the moment, the temperatures > > > are within the expected range. > > > > > > Any clues? > > > > > > Chavdar > > > > Unfortunately, "me too". But I did not manage to see the logs or > > track down the change which caused this behaviour (sorry). > > At least I can see it is not something specific to my host and installation. > > > > > I had panics for a while (in January?) when using nvmm, and once > > that was fixed the "stalling" behaviour started. > > BTW, the hangs happen with my Windows guests; I tried today a couple > of Linux ones, they ran without a problem. 
> > I just tried also an OmniOS guest; this one used to work fine on the > 29th of February, when I booted and updated it today it did not > complete the boot at all and I had to restart the host as it was > unresponsive. That was my experience too (Windows guests hang, but Linux guests do not seem to hang). -- Kind regards, Yorick Hardy
Re: Weird qemu-nvmm problem
Dear Chavdar, On 2020-03-08, Chavdar Ivanov wrote: > Hi, > > On a -current (from today, but has happened before), when running a > particular nvmm guest (32-bit Windows 10), usually when it is busy > going through some updates, the host gets into a weird state. I have > an ssh connection to it with several tmux panes open; I can switch > between them, so the connection to the host is still ok, but in the > same time the host does not answer to pings anymore; none of the tmux > panes themselves accepts any input, with the exception of the one I am > running the qemu client in; I can interrupt it and after that the host > comes into normal state. When this happens, I get > > [ 7444.602404] coretemp3: workqueue busy: updates stopped > [ 7474.614306] coretemp0: workqueue busy: updates stopped > [ 7474.614306] coretemp1: workqueue busy: updates stopped > [ 7474.614306] coretemp2: workqueue busy: updates stopped > [ 23591.005414] acpitz4: workqueue busy: updates stopped > [ 23591.005414] acpibat1: workqueue busy: updates stopped > > The machine is not doing anything else at the moment, the temperatures > are within the expected range. > > Any clues? > > Chavdar Unfortunately, "me too". But I did not manage to see the logs or track down the change which caused this behaviour (sorry). I had panics for a while (in January?) when using nvmm, and once that was fixed the "stalling" behaviour started. -- Kind regards, Yorick Hardy
Re: audio panic
Dear Tetsuya, On 2019-11-04, Yorick Hardy wrote: > Dear Tetsuya, > > On 2019-11-02, Tetsuya Isaki wrote: > > At Sat, 26 Oct 2019 18:27:36 +0200, > > Yorick Hardy wrote: > > > [ 166.145911] panic: kernel diagnostic assertion "ring->used + n <= > > > ring->capacity" failed: file "/usr/src/local/sys/dev/audio/audiodef.h", > > > line 406 called from audio_track_record:4518: ring->used=32256 n=32256 > > > ring->capacity=61440 > > > [ 166.145911] cpu3: Begin traceback... > > > [ 166.145911] vpanic() at netbsd:vpanic+0x178 > > > [ 166.145911] kern_assert() at netbsd:kern_assert+0x48 > > > [ 166.155927] audioread() at netbsd:audioread+0xb87 > > > > Can you reproduce this? > > > > Thanks, > > --- > > Tetsuya Isaki > > I should be able to try to get it to happen again tomorrow (I managed > to trigger the panic on my work computer only so far). The offending > command is: > > ffplay4 -hide_banner -showmode waves -f oss /dev/audio > > (to test the microphone). I think ffmpeg was reading audio much > slower than the driver was providing it (because of the recording > rate mismatch in our oss which you have kindly fixed). My attempts at reproducing this with audioio did not work. But reverting the libossaudio fixes makes it reproducible with ffplay4 (this is again because ffplay4 reads the audio at 8000Hz instead of 48000Hz). I have a crash dump if that will help (custom kernel): Crash version 9.99.17, image version 9.99.17. System panicked: trap Backtrace from time of crash is available. db> crash> bt _KERNEL_OPT_NAGR() at 0 ?() at b0013f65 vpanic() at vpanic+0x181 snprintf() at snprintf startlwp() at startlwp calltrap() at calltrap+0x11 dofileread() at dofileread+0x8f sys_read() at sys_read+0x49 syscall() at syscall+0x1d8 --- syscall (number 3) --- 79d6aac42b7a: Kind regards, -- Yorick Hardy
Re: audio panic
Dear Tetsuya, On 2019-11-04, Yorick Hardy wrote: > Dear Tetsuya, > > On 2019-11-02, Tetsuya Isaki wrote: > > At Sat, 26 Oct 2019 18:27:36 +0200, > > Yorick Hardy wrote: > > > [ 166.145911] panic: kernel diagnostic assertion "ring->used + n <= > > > ring->capacity" failed: file "/usr/src/local/sys/dev/audio/audiodef.h", > > > line 406 called from audio_track_record:4518: ring->used=32256 n=32256 > > > ring->capacity=61440 > > > [ 166.145911] cpu3: Begin traceback... > > > [ 166.145911] vpanic() at netbsd:vpanic+0x178 > > > [ 166.145911] kern_assert() at netbsd:kern_assert+0x48 > > > [ 166.155927] audioread() at netbsd:audioread+0xb87 > > > > Can you reproduce this? > > > > Thanks, > > --- > > Tetsuya Isaki > > I should be able to try to get it to happen again tomorrow (I managed > to trigger the panic on my work computer only so far). The offending > command is: > > ffplay4 -hide_banner -showmode waves -f oss /dev/audio > > (to test the microphone). I think ffmpeg was reading audio much > slower than the driver was providing it (because of the recording > rate mismatch in our oss which you have kindly fixed). I have not been able to reproduce this yet (I am using the fixed libossaudio; I will have to try to go back to see if I can trigger the crash with the unfixed version -- I tried to write a small program to reproduce the problem but it works without fail). I have been reading the code around the crash a bit, and I have a question: https://nxr.netbsd.org/xref/src/sys/dev/audio/audiodef.h#116 116 u_int usrbuf_usedhigh;/* high water mark in bytes */ but usrbuf_usedhigh is used as if it is measured in frames? https://nxr.netbsd.org/xref/src/sys/dev/audio/audio.c#4501 4500 count = uimin(count, 4501 (track->usrbuf_usedhigh - usrbuf->used) / framesize); 4502 bytes = count * framesize; (and apparently also throughout the rest of audio.c).
I wonder if this should be: 4500 count = uimin(count, 4501 track->usrbuf_usedhigh - usrbuf->used); I doubt that this is the cause of the panic though. If I may guess: https://nxr.netbsd.org/xref/src/sys/dev/audio/audio.c#4521 https://nxr.netbsd.org/xref/src/sys/dev/audio/audio.c#4525 4521 bytes2 = bytes - bytes1; 4525 auring_push(usrbuf, bytes2); does not check whether bytes2 is small enough? (The panic happened when the software was consuming audio at a much lower rate than it asked for.) Thanks for looking at this! -- Yorick Hardy
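For illustration only, the kind of bound I am guessing at can be sketched with a toy byte-counted ring. The struct and function names below are hypothetical and are not the kernel's audio_ring_t or auring_push(); the sketch just shows a push that is clamped to the free space, using the numbers from the panic message (used=32256, n=32256, capacity=61440, where 32256 + 32256 = 64512 exceeds the capacity).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a byte-counted ring buffer. */
struct ring {
    size_t capacity;    /* total bytes the ring can hold */
    size_t used;        /* bytes currently stored */
};

/* Bytes that can still be pushed without overflowing. */
size_t ring_free(const struct ring *r)
{
    return r->capacity - r->used;
}

/*
 * Push up to n bytes, clamping to the free space so that the
 * invariant used <= capacity can never be violated.
 * Returns the number of bytes actually accepted.
 */
size_t ring_push_clamped(struct ring *r, size_t n)
{
    size_t room = ring_free(r);

    if (n > room)
        n = room;
    r->used += n;
    assert(r->used <= r->capacity);
    return n;
}
```

With the values from the panic, only 29184 of the 32256 bytes fit; whether the real code should clamp, drop, or block at that point is a separate question.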
Re: audio panic
Dear Tetsuya, On 2019-11-02, Tetsuya Isaki wrote: > At Sat, 26 Oct 2019 18:27:36 +0200, > Yorick Hardy wrote: > > [ 166.145911] panic: kernel diagnostic assertion "ring->used + n <= > > ring->capacity" failed: file "/usr/src/local/sys/dev/audio/audiodef.h", > > line 406 called from audio_track_record:4518: ring->used=32256 n=32256 > > ring->capacity=61440 > > [ 166.145911] cpu3: Begin traceback... > > [ 166.145911] vpanic() at netbsd:vpanic+0x178 > > [ 166.145911] kern_assert() at netbsd:kern_assert+0x48 > > [ 166.155927] audioread() at netbsd:audioread+0xb87 > > Can you reproduce this? > > Thanks, > --- > Tetsuya Isaki I should be able to try to get it to happen again tomorrow (I managed to trigger the panic on my work computer only so far). The offending command is: ffplay4 -hide_banner -showmode waves -f oss /dev/audio (to test the microphone). I think ffmpeg was reading audio much slower than the driver was providing it (because of the recording rate mismatch in our oss which you have kindly fixed). -- Kind regards, Yorick Hardy
Re: X11 - WindowMaker characters issues after update
Dear Riccardo, On 2019-11-03, Riccardo Mottola wrote: > Hi, > > I upgraded NetBSD current and then upgraded all pkgsrc packages with > pkg_rolling-replace. > After two attempts of prr, everything upgraded except rust (which continutes > to hang/fail). > > X11 works, but I use WindowMaker and it is unusable now (and WPrefs > neither). All displayed characters are replaced with rectangles with small > numbers (like when special characters cannot be displayed): everything, > menus, window titles, etc. Unreadable. > > Interestingly, other X11 apps work: both xterm, xedit work. Also SeaMonkey > and gVim. > > Below I paste ld of wmaker: > > /usr/pkg/bin/wmaker: > -lWINGs.3 => /usr/pkg/lib/libWINGs.so.3 > -lWUtil.5 => /usr/pkg/lib/libWUtil.so.5 > -lkvm.6 => /usr/lib/libkvm.so.6 > -lc.12 => /usr/lib/libc.so.12 > -lintl.1 => /usr/lib/libintl.so.1 > -lwraster.6 => /usr/pkg/lib/libwraster.so.6 > -lXpm.5 => /usr/X11R7/lib/libXpm.so.5 > -lXext.7 => /usr/X11R7/lib/libXext.so.7 > -lX11.7 => /usr/X11R7/lib/libX11.so.7 > -lxcb.2 => /usr/X11R7/lib/libxcb.so.2 > -lXau.7 => /usr/X11R7/lib/libXau.so.7 > -lXdmcp.7 => /usr/X11R7/lib/libXdmcp.so.7 > -lpng16.16 => /usr/pkg/lib/libpng16.so.16 > -lz.1 => /usr/lib/libz.so.1 > -lm.0 => /usr/lib/libm.so.0 > -lgif.7 => /usr/pkg/lib/libgif.so.7 > -ltiff.5 => /usr/pkg/lib/libtiff.so.5 > -llzma.2 => /usr/lib/liblzma.so.2 > -lpthread.1 => /usr/lib/libpthread.so.1 > -ljbig.2 => /usr/pkg/lib/libjbig.so.2 > -ljpeg.9 => /usr/pkg/lib/libjpeg.so.9 > -lwebp.7 => /usr/pkg/lib/libwebp.so.7 > -lXmu.7 => /usr/X11R7/lib/libXmu.so.7 > -lXt.7 => /usr/X11R7/lib/libXt.so.7 > -lSM.7 => /usr/X11R7/lib/libSM.so.7 > -lICE.7 => /usr/X11R7/lib/libICE.so.7 > -lpangoxft-1.0.0 => /usr/pkg/lib/libpangoxft-1.0.so.0 > -lpango-1.0.0 => /usr/pkg/lib/libpango-1.0.so.0 > -lglib-2.0.0 => /usr/pkg/lib/libglib-2.0.so.0 > -lpcre.1 => /usr/pkg/lib/libpcre.so.1 > -lgobject-2.0.0 => /usr/pkg/lib/libgobject-2.0.so.0 > -lffi.6 => /usr/pkg/lib/libffi.so.6 > -lfribidi.0 => 
/usr/pkg/lib/libfribidi.so.0 > -lharfbuzz.0 => /usr/pkg/lib/libharfbuzz.so.0 > -lfreetype.19 => /usr/X11R7/lib/libfreetype.so.19 > -lbz2.1 => /usr/lib/libbz2.so.1 > -lgraphite2.3 => /usr/pkg/lib/libgraphite2.so.3 > -lstdc++.9 => /usr/lib/libstdc++.so.9 > -lgcc_s.1 => /usr/lib/libgcc_s.so.1 > -lpangoft2-1.0.0 => /usr/pkg/lib/libpangoft2-1.0.so.0 > -lfontconfig.2 => /usr/X11R7/lib/libfontconfig.so.2 > -lexpat.2 => /usr/lib/libexpat.so.2 > -lXrender.2 => /usr/X11R7/lib/libXrender.so.2 > -lXft.3 => /usr/X11R7/lib/libXft.so.3 > -lXrandr.3 => /usr/X11R7/lib/libXrandr.so.3 > -lXinerama.2 => /usr/X11R7/lib/libXinerama.so.2 This might be more pango fallout: https://blogs.gnome.org/mclasen/2019/05/25/pango-future-directions/ "Using Harfbuzz for font loading means that we will lose support for bitmap and type1 fonts. We think this is an acceptable trade-off, but others might disagree. Note that Harfbuzz does support loading bitmap-only OpenType fonts." -- Kind regards, Yorick Hardy
audio panic
Dear current-users, When recording with a recently updated kernel, I experienced some kernel panics. I think this has been the case for some time now - but I only got around to reporting the panics now (apologies). I captured two panics in my dmesg, I guess the second is most relevant. I have also managed to record audio using ffmpeg4 without a panic, but the audio plays back too slowly and is a bit garbled (I was unable to diagnose the quality problems - but the sampling speed was much higher than specified). Any tips would be appreciated. uname: NetBSD HOME 9.99.17 NetBSD 9.99.17 (YORICK.amd64) #1: Sat Oct 26 00:06:47 SAST 2019 root@HOME:/root/build.amd64.local/obj/sys/arch/amd64/compile/YORICK.amd64 amd64 dmesg: ... [ 1.052454] hdaudio0 at pci0 dev 31 function 3: HD Audio Controller [ 1.052454] hdaudio0: interrupting at msi3 vec 0 [ 1.052454] hdafg0 at hdaudio0: vendor 14f1 product 50f4 [ 1.052454] hdafg0: DAC00 2ch: Speaker [Built-In] [ 1.052454] hdafg0: ADC01 2ch: Mic In [Built-In] [ 1.052454] hdafg0: DAC03 2ch: HP Out [Jack] [ 1.052454] hdafg0: 2ch/2ch 48000Hz 96000Hz PCM16 PCM24 AC3 [ 1.052454] audio0 at hdafg0: playback, capture, full duplex, independent [ 1.052454] audio0: slinear_le:16 2ch 48000Hz, blk 40ms for playback [ 1.052454] audio0: slinear_le:16 2ch 48000Hz, blk 40ms for recording [ 1.052454] spkr0 at audio0: PC Speaker (synthesized) [ 1.052454] wsbell at spkr0 not configured [ 1.052454] hdafg1 at hdaudio0: vendor 8086 product 2809 [ 1.052454] hdafg1: DP00 8ch: Digital Out [Jack] [ 1.052454] hdafg1: 8ch/0ch 48000Hz PCM16* ... [ 253.893572] uvm_fault(0x810c3160, 0xc8002035a000, 1) -> e [ 253.893572] fatal page fault in supervisor mode [ 253.893572] trap type 6 code 0 rip 0x80b090df cs 0x8 rflags 0x10246 cr2 0xc8002035a000 ilevel 0 rsp 0xc8013f042e78 [ 253.893572] curlwp 0x80a4e46150e0 pid 2801.4 lowest kstack 0xc8013f0402c0 [ 253.893572] panic: trap [ 253.893572] cpu1: Begin traceback... 
[ 253.893572] vpanic() at netbsd:vpanic+0x178 [ 253.893572] snprintf() at netbsd:snprintf [ 253.893572] startlwp() at netbsd:startlwp [ 253.893572] alltraps() at netbsd:alltraps+0xbb [ 253.893572] dofileread() at netbsd:dofileread+0x8f [ 253.903584] sys_read() at netbsd:sys_read+0x49 [ 253.903584] syscall() at netbsd:syscall+0x1d8 [ 253.903584] --- syscall (number 3) --- [ 253.903584] 7e3IV, PNP0C14-2): ACPI WMI Interface ... [ 166.145911] panic: kernel diagnostic assertion "ring->used + n <= ring->capacity" failed: file "/usr/src/local/sys/dev/audio/audiodef.h", line 406 called from audio_track_record:4518: ring->used=32256 n=32256 ring->capacity=61440 [ 166.145911] cpu3: Begin traceback... [ 166.145911] vpanic() at netbsd:vpanic+0x178 [ 166.145911] kern_assert() at netbsd:kern_assert+0x48 [ 166.155927] audioread() at netbsd:audioread+0xb87 [ 166.155927] dofileread() at netbsd:dofileread+0x8f [ 166.155927] sys_read() at netbsd:sys_read+0x49 [ 166.155927] syscall() at netbsd:syscall+0x211 [ 166.155927] --- syscall (number 3) --- [ 166.155927] 7047c3c42b7a: [ 166.155927] cpu3: End traceback... -- Kind regards, Yorick Hardy
Re: Mesa update
On 2019-04-18, co...@sdf.org wrote: > LD_PRELOAD=/usr/lib/libpthread.so fixes it. > It's the libc stubs. I don't know what to link against libpthread but > pkgsrc is cheating by having glmark2 linked with libpthread. Thank you for updating Mesa, it is much appreciated. SDL2 applications which load libGL.so seem to fail due to missing symbols, I had to add some dependencies as below. Is this correct? -- Kind regards, Yorick Hardy Index: external/mit/xorg/lib/Makefile === RCS file: /cvsroot/src/external/mit/xorg/lib/Makefile,v retrieving revision 1.49 diff -u -u -r1.49 Makefile --- external/mit/xorg/lib/Makefile 16 Apr 2019 21:20:51 - 1.49 +++ external/mit/xorg/lib/Makefile 22 Apr 2019 06:43:24 - @@ -20,10 +20,12 @@ .endif SUBDIR+=libxcb \ .WAIT +SUBDIR+=libX11 \ + .WAIT .if !defined(MLIBDIR) SUBDIR+=${EXTRA_DRI_DIRS} dri${OLD_PREFIX} gallium${OLD_PREFIX} .endif -SUBDIR+=fontconfig libSM libX11 \ +SUBDIR+=fontconfig libSM \ .WAIT \ libXcomposite libXdamage libXext libXfixes libXt \ libxkbfile libepoxy \ Index: external/mit/xorg/lib/gallium/Makefile === RCS file: /cvsroot/src/external/mit/xorg/lib/gallium/Makefile,v retrieving revision 1.25 diff -u -u -r1.25 Makefile --- external/mit/xorg/lib/gallium/Makefile 16 Apr 2019 17:29:09 - 1.25 +++ external/mit/xorg/lib/gallium/Makefile 22 Apr 2019 06:43:24 - @@ -957,6 +957,9 @@ LIBDPLIBS+=terminfo${.CURDIR}/../../../../../lib/libterminfo LIBDPLIBS+=z ${.CURDIR}/../../../../../lib/libz LIBDPLIBS+=execinfo${.CURDIR}/../../../../../lib/libexecinfo +LIBDPLIBS+=xcb ${.CURDIR}/../libxcb/libxcb +LIBDPLIBS+=xcb-dri2${.CURDIR}/../libxcb/dri2 +LIBDPLIBS+=X11-xcb ${.CURDIR}/../libX11/libX11-xcb # gallium drivers requiring LLVM .if ${BUILD_LLVMPIPE} == 1 || ${BUILD_RADEON} == 1
Re: CVS commit: src/sys/external/bsd/drm2
On 2017-03-01, Martin Husemann wrote: > On Wed, Mar 01, 2017 at 06:55:24PM +0900, Kimihiro Nonaka wrote: > > Updated the patch. > > Still works fine for me! > > Martin Also works fine for me on i386+intel and amd64+radeon. Thanks! -- Kind regards, Yorick Hardy
Re: CVS commit: src/sys/external/bsd/drm2
On 2017-02-28, Kimihiro Nonaka wrote: > Hi, > > 2017-02-28 4:10 GMT+09:00 Martin Husemann <mar...@duskware.de>: > > > On Mon, Feb 27, 2017 at 07:39:49PM +0100, Martin Husemann wrote: > >> On Mon, Feb 27, 2017 at 08:29:10PM +0200, Yorick Hardy wrote: > >> > Is anyone else experiencing GPU hangs since revision 1.14 > >> > of src/sys/external/bsd/drm2/pci/drm_pci.c ? > >> > >> Thanks for the hint, I'm testing if it is the cause for > >> > >> http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=51997 > >> > >> and will report back ASAP... > > > > Yes, that seems to be the case. > > I've reverted. Thank you! It is not obvious to me why the MSI changes were problematic. Do you have any ideas what went wrong? -- Kind regards, Yorick Hardy
Re: CVS commit: src/sys/external/bsd/drm2
ndor 8086 product 27c5 (rev. 0x02) ahcisata0: interrupting at ioapic0 pin 17 ahcisata0: AHCI revision 1.10, 4 ports, 32 slots, CAP 0xdf10ff03<PSC,SSC,PMD,ISS=0x1=Gen1,SCLO,SAL,SALP,SSS,SMPS,SNCQ,S64A> atabus1 at ahcisata0 channel 0 atabus2 at ahcisata0 channel 2 ichsmb0 at pci0 dev 31 function 3: vendor 8086 product 27da (rev. 0x02) ichsmb0: interrupting at ioapic0 pin 17 iic0 at ichsmb0: I2C bus isa0 at ichlpcib0 acpicpu0 at cpu0: ACPI CPU acpicpu0: C1: FFH, lat 1 us, pow 1000 mW acpicpu0: C2: I/O, lat 1 us, pow 500 mW acpicpu0: C3: I/O, lat 57 us, pow 100 mW acpicpu0: P0: FFH, lat 10 us, pow 2000 mW, 1600 MHz acpicpu0: P1: FFH, lat 10 us, pow 1533 mW, 1333 MHz acpicpu0: P2: FFH, lat 10 us, pow 1066 mW, 1066 MHz acpicpu0: P3: FFH, lat 10 us, pow 600 mW, 800 MHz acpicpu0: T0: I/O, lat 1 us, pow 0 mW, 100 % acpicpu0: T1: I/O, lat 1 us, pow 0 mW, 88 % acpicpu0: T2: I/O, lat 1 us, pow 0 mW, 76 % acpicpu0: T3: I/O, lat 1 us, pow 0 mW, 64 % acpicpu0: T4: I/O, lat 1 us, pow 0 mW, 52 % acpicpu0: T5: I/O, lat 1 us, pow 0 mW, 40 % acpicpu0: T6: I/O, lat 1 us, pow 0 mW, 28 % acpicpu0: T7: I/O, lat 1 us, pow 0 mW, 16 % coretemp0 at cpu0: thermal sensor, 1 C resolution, Tjmax=100 acpicpu1 at cpu1: ACPI CPU DRM error in i915_irq_handler: pipe A underrun timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0 acpiacad0: AC adapter offline. IPsec: Initialized Security Association Processing. 
uhub0 at usb0: vendor 8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub0: 2 ports with 2 removable, self powered uhub1 at usb1: vendor 8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub1: 2 ports with 2 removable, self powered uhub2 at usb2: vendor 8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub2: 2 ports with 2 removable, self powered uhub3 at usb3: vendor 8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1 uhub3: 2 ports with 2 removable, self powered uhub4 at usb4: vendor 8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1 uhub4: 8 ports with 8 removable, self powered ahcisata0 port 0: device present, speed: 1.5Gb/s wd0 at atabus1 drive 0 wd0: wd0: drive supports 16-sector PIO transfers, LBA48 addressing wd0: 232 GB, 484521 cyl, 16 head, 63 sec, 512 bytes/sect x 488397168 sectors wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) wd0(ahcisata0:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) (using DMA) uvideo0 at uhub4 port 5 configuration 1 interface 0: BISON Corporation LG, rev 2.00/18.02, addr 2 video0 at uvideo0: BISON Corporation LG, rev 2.00/18.02, addr 2 WARNING: 2 errors while detecting hardware; check system log. boot device: wd0 root on wd0a dumps on wd0b root file system type: ffs kern.module.path=/stand/i386/7.99.62/modules ral0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps ral0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps ubt0 at uhub1 port 1 ubt0: Broadcom Corp Broadcom Bluetooth 2.1 Device, rev 2.00/0.72, addr 2 wsdisplay0: screen 1 added (default, vt100 emulation) wsdisplay0: screen 2 added (default, vt100 emulation) wsdisplay0: screen 3 added (default, vt100 emulation) wsdisplay0: screen 4 added (default, vt100 emulation) -- Kind regards, Yorick Hardy
Re: XOrg oriented build failure
On 2017-01-04, bch wrote: > On Jan 4, 2017 21:37, "Martin Husemann" <mar...@duskware.de> wrote: > > On Wed, Jan 04, 2017 at 09:29:05PM -0800, bch wrote: > > transform.o: In function `map_to_output': > > transform.c:(.text+0x2a3): undefined reference to `xi2_find_device_info' > > collect2: error: ld returned 1 exit status > > Clean the xinput obj dir and retry (this needs another UPDATING entry). > > > I tried that (given our exchange earlier today), but I'll try again in case > some cruft was missed. > > Cheers, That symbol is defined in xsrc/external/mit/xinput/dist/src/xinput.c, but is protected by a "#ifdef HAVE_XI2". Did you clear out obj/external/mit/xorg/bin/xinput? (It seems yes from your answer above.) If it is broken, then I am responsible and will try to fix it. -- Kind regards, Yorick Hardy
Illegal instruction in libcrypto.so
Dear all, I frequently encounter an illegal instruction when using SSL with python2.7 and wip/rawdog. $ uname -a NetBSD HOME 7.99.53 NetBSD 7.99.53 (YORICK.amd64) #0: Sun Jan 1 16:42:47 SAST 2017 root@HOME:/root/build.amd64.local/obj/sys/arch/amd64/compile/YORICK.amd64 amd64 It seems OpenSSL is using instructions which are not available on my CPU (no known problems on 7-STABLE). The last few entries of the backtrace are: Core was generated by `python2.7'. Program terminated with signal SIGILL, Illegal instruction. #0 0x71df7e9735ca in bn_GF2m_mul_2x2 () from /usr/lib/libcrypto.so.12 [Current thread is 1 (LWP 1)] (gdb) bt #0 0x71df7e9735ca in bn_GF2m_mul_2x2 () from /usr/lib/libcrypto.so.12 #1 0x71df7e96ebb7 in BN_GF2m_mod_mul_arr () from /usr/lib/libcrypto.so.12 #2 0x71df7e968cc9 in ec_GF2m_simple_is_on_curve () from /usr/lib/libcrypto.so.12 #3 0x71df7e927733 in ec_GF2m_simple_oct2point () from /usr/lib/libcrypto.so.12 #4 0x71df7ee45813 in ssl3_get_key_exchange () from /usr/lib/libssl.so.12 #5 0x71df7ee46789 in ssl3_connect () from /usr/lib/libssl.so.12 #6 0x71df7cc0bdbe in PySSL_SSLdo_handshake () from /usr/pkg/lib/python2.7/lib-dynload/_ssl.so #7 0x71df842d39bd in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0 Dump of assembler code for function bn_GF2m_mul_2x2: 0x71df7e9735a0 <+0>: lea 0x2b30f5(%rip),%rax # 0x71df7ec2669c 0x71df7e9735a7 <+7>: bt $0x21,%rax 0x71df7e9735ac <+12>: jae 0x71df7e973610 <bn_GF2m_mul_2x2+112> 0x71df7e9735ae <+14>: movq %rsi,%xmm0 0x71df7e9735b3 <+19>: movq %rcx,%xmm1 0x71df7e9735b8 <+24>: movq %rdx,%xmm2 0x71df7e9735bd <+29>: movq %r8,%xmm3 0x71df7e9735c2 <+34>: movdqa %xmm0,%xmm4 0x71df7e9735c6 <+38>: movdqa %xmm1,%xmm5 => 0x71df7e9735ca <+42>: pclmullqlqdq %xmm1,%xmm0 0x71df7e9735d0 <+48>: pxor %xmm2,%xmm4 0x71df7e9735d4 <+52>: pxor %xmm3,%xmm5 0x71df7e9735d8 <+56>: pclmullqlqdq %xmm3,%xmm2 0x71df7e9735de <+62>: pclmullqlqdq %xmm5,%xmm4 The CPU features clearly omit PCLMULQDQ: $ cpuctl identify 0 cpu0: highest basic info 0005 cpu0: 
highest extended info 801b cpu0: "AMD Athlon(tm) II X3 450 Processor" cpu0: AMD Family 10h (686-class), 3200.27 MHz cpu0: family 0x10 model 0x5 stepping 0x3 (id 0x100f53) cpu0: features 0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE> cpu0: features 0x178bfbff<MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT> cpu0: features1 0x802009<SSE3,MONITOR,CX16,POPCNT> cpu0: features2 0xefd3fbff cpu0: features2 0xefd3fbff<3DNOW2,3DNOW> cpu0: features3 0x37ff<LAHF,CMPLEGACY,SVM,EAPIC,ALTMOVCR0,LZCNT,SSE4A> cpu0: features3 0x37ff<MISALIGNSSE,3DNOWPREFETCH,OSVW,IBS,SKINIT,WDT> cpu0: I-cache 64KB 64B/line 2-way, D-cache 64KB 64B/line 2-way cpu0: L2 cache 512KB 64B/line 16-way cpu0: ITLB 32 4KB entries fully associative, 16 2MB entries fully associative cpu0: DTLB 48 4KB entries fully associative, 48 2MB entries fully associative cpu0: L2 ITLB 512 4KB entries 4-way cpu0: L2 DTLB 512 4KB entries 4-way, 128 2MB entries 2-way cpu0: L1 1GB page DTLB 48 1GB entries fully associative cpu0: L2 1GB page DTLB 16 1GB entries 8-way cpu0: Initial APIC ID 0 cpu0: AMD Power Management features: 0x1f9<TS,TTP,HTC,STC,100,HWP,TSC> cpu0: SVM Rev. 1 cpu0: SVM NASID 64 cpu0: SVM features 0xf<NP,LbrVirt,SVML,NRIPS> cpu0: UCode version: 0x1c8 A wild guess is that this change is involved http://cvsweb.netbsd.org/bsdweb.cgi/src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gf2m.pl.diff?r1=1.3=1.4 but I don't understand the change. Any ideas what could be wrong? -- Kind regards, Yorick Hardy
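As a cross-check of the cpuctl output above, PCLMULQDQ support is reported in bit 1 of ECX for CPUID leaf 1. The sketch below decodes that bit from a raw register value; the function name is hypothetical, and treating the "features1" word printed by cpuctl as that ECX value is an assumption (it is consistent with the decoded SSE3, MONITOR, CX16 and POPCNT flags, which are bits 0, 3, 13 and 23).

```c
#include <stdint.h>

/* CPUID leaf 1 reports PCLMULQDQ support in bit 1 of ECX. */
#define CPUID1_ECX_PCLMULQDQ (UINT32_C(1) << 1)

/*
 * Hypothetical helper: returns 1 if the given CPUID.1:ECX value
 * advertises PCLMULQDQ, 0 otherwise.
 */
int has_pclmulqdq(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & CPUID1_ECX_PCLMULQDQ) != 0;
}
```

Feeding it the 0x802009 value from the listing gives 0, matching the absence of PCLMULQDQ in the decoded feature names.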
Re: Fix for kern/51772 breaks linking multi-config kernels?
Dear Christos,

On 2017-01-04, Christos Zoulas wrote:
> In article <20170104195823.GD26839@HOME>,
> Yorick Hardy <yorickha...@gmail.com> wrote:
> >Dear Martin,
> >
> >On 2017-01-04, Martin Husemann wrote:
> >> Can't you just use swap${.TARGET}.c instead of the wildcard?
> >>
> >> Martin
> >
> >I don't think so, because then we pick up more than one swap*.o
> >when linking (and redefinition of symbols).
> >
> >Or did I misunderstand? (I assumed you meant swap${.TARGET}.o).
> >
> >I think I have a working patch, but I think we can do better (i.e. fewer
> >assumptions about filenames):
> >
> >Index: sys/conf/Makefile.kern.inc
> >===
> >RCS file: /cvsroot/src/sys/conf/Makefile.kern.inc,v
> >retrieving revision 1.251
> >diff -u -r1.251 Makefile.kern.inc
[snip]
> I thought we wanted to match the M and N modifiers so we select and deselect
> the same files?

That was my thought too! But when multiple kernels are configured we get
multiple swap*.o files with duplicate symbols.

How about the following? This patch removes all the wildcards, excludes the
swap*.o files of every configured kernel from the object list, and then adds
back only the swap*.o file for the current target.

Did I miss anything?

--
Kind regards,

Yorick Hardy

Index: sys/conf/Makefile.kern.inc
===
RCS file: /cvsroot/src/sys/conf/Makefile.kern.inc,v
retrieving revision 1.252
diff -u -r1.252 Makefile.kern.inc
--- sys/conf/Makefile.kern.inc 4 Jan 2017 19:55:06 - 1.252
+++ sys/conf/Makefile.kern.inc 4 Jan 2017 20:26:50 -
@@ -203,6 +203,10 @@
 SYSTEM_LIB=${MD_LIBS} ${SYSLIBCOMPAT} ${LIBKERN}
 SYSTEM_OBJ?= ${_MD_OBJS} ${OBJS} ${SYSTEM_LIB}
+REMOVE_SWAP= [@]
+.for k in ${KERNELS}
+REMOVE_SWAP:= ${REMOVE_SWAP}:Nswap${k}.o
+.endfor
 SYSTEM_DEP+= Makefile ${SYSTEM_OBJ:O}
 .if defined(CTFMERGE)
 SYSTEM_CTFMERGE= ${CTFMERGE} ${CTFMFLAGS} -o ${.TARGET} ${SYSTEM_OBJ} ${EXTRA_OBJ} vers.o
@@ -213,11 +217,11 @@
 SYSTEM_LD?=${_MKSHMSG} " link ${.CURDIR:T}/${.TARGET}"; \
  ${_MKSHECHO}\
  ${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
- '$${SYSTEM_OBJ:N*swap${.TARGET}.o}' '$${EXTRA_OBJ}' vers.o \
- ${OBJS:M*swap${.TARGET}.o}; \
+ '$${SYSTEM_OBJ:${REMOVE_SWAP}}' '$${EXTRA_OBJ}' vers.o \
+ ${OBJS:Mswap${.TARGET}.o}; \
  ${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
- ${SYSTEM_OBJ:N*swap${.TARGET}.o} ${EXTRA_OBJ} vers.o \
- ${OBJS:M*swap${.TARGET}.o}
+ ${SYSTEM_OBJ:${REMOVE_SWAP}} ${EXTRA_OBJ} vers.o \
+ ${OBJS:Mswap${.TARGET}.o}
 
 TEXTADDR?= ${LOADADDRESS} # backwards compatibility
 LINKTEXT?= ${TEXTADDR:C/.+/-Ttext &/}
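The effect of the REMOVE_SWAP chain in the patch above can be sketched outside of make. The following Python sketch only approximates the `:N` and `:M` modifiers with `fnmatch`. The object list is hypothetical, the kernel names are borrowed from the YEELOONG configuration quoted elsewhere in the thread, and a single list stands in for both `SYSTEM_OBJ` and `OBJS`.

```python
from fnmatch import fnmatchcase


def keep_not_matching(words, pattern):
    """Rough stand-in for make's ':Npattern' modifier."""
    return [w for w in words if not fnmatchcase(w, pattern)]


def keep_matching(words, pattern):
    """Rough stand-in for make's ':Mpattern' modifier."""
    return [w for w in words if fnmatchcase(w, pattern)]


def link_objects(system_obj, kernels, target):
    """Apply one ':Nswap<k>.o' per configured kernel, then re-add the target's."""
    remaining = list(system_obj)            # the leading [@] keeps every word
    for k in kernels:                       # .for k in ${KERNELS}
        remaining = keep_not_matching(remaining, "swap" + k + ".o")
    return remaining + ["vers.o"] + keep_matching(system_obj, "swap" + target + ".o")


# Hypothetical inputs for illustration only.
KERNELS = ["netbsd_nfs", "netbsd_sd0", "netbsd_sd1"]
SYSTEM_OBJ = ["locore.o", "uvm_swap.o"] + ["swap" + k + ".o" for k in KERNELS]

for target in KERNELS:
    print(target, "->", link_objects(SYSTEM_OBJ, KERNELS, target))
```

Because every pattern is an exact file name, an unrelated object such as `uvm_swap.o` stays in the list, and each target is linked with exactly one swap object.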
Re: Fix for kern/51772 breaks linking multi-config kernels?
Dear Martin,

On 2017-01-04, Martin Husemann wrote:
> Can't you just use swap${.TARGET}.c instead of the wildcard?
>
> Martin

I don't think so, because then we pick up more than one swap*.o
when linking (and redefinition of symbols).

Or did I misunderstand? (I assumed you meant swap${.TARGET}.o).

I think I have a working patch, but I think we can do better (i.e. fewer
assumptions about filenames):

Index: sys/conf/Makefile.kern.inc
===
RCS file: /cvsroot/src/sys/conf/Makefile.kern.inc,v
retrieving revision 1.251
diff -u -r1.251 Makefile.kern.inc
--- sys/conf/Makefile.kern.inc 4 Jan 2017 15:43:04 - 1.251
+++ sys/conf/Makefile.kern.inc 4 Jan 2017 18:12:27 -
@@ -213,10 +213,10 @@
 SYSTEM_LD?=${_MKSHMSG} " link ${.CURDIR:T}/${.TARGET}"; \
  ${_MKSHECHO}\
  ${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
- '$${SYSTEM_OBJ:N*swap*${.TARGET}*}' '$${EXTRA_OBJ}' vers.o \
+ '$${SYSTEM_OBJ:Nswap*}' '$${EXTRA_OBJ}' vers.o \
  ${OBJS:M*swap${.TARGET}.o}; \
  ${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
- ${SYSTEM_OBJ:N*swap*${.TARGET}*} ${EXTRA_OBJ} vers.o \
+ ${SYSTEM_OBJ:Nswap*} ${EXTRA_OBJ} vers.o \
  ${OBJS:M*swap${.TARGET}.o}
 
 TEXTADDR?= ${LOADADDRESS} # backwards compatibility

--
Kind regards,

Yorick Hardy
Re: Fix for kern/51772 breaks linking multi-config kernels?
On 2017-01-04, Yorick Hardy wrote:
> On 2017-01-04, Yorick Hardy wrote:
> > Dear Martin,
> >
> > On 2017-01-04, Martin Husemann wrote:
> > > On Wed, Jan 04, 2017 at 07:28:19PM +0200, Yorick Hardy wrote:
> > > > Apologies, my "fix" broke your build. I wonder why it worked before,
> > > > probably because you have "netbsd" as part of your kernel name?
> > > >
> > > > Maybe the change should be reverted until the correct solution is found.
> > >
> > > Just a side note: we have several evb* configs that use multiple config
> > > statements, for example sys/arch/evbmips/conf/ZYXELKX:
> > >
> > > config netbsd root on ? type ?
> > > config netbsd-sd0a root on sd0a type ffs dumps none
> > > config netbsd-reth0 root on reth0 type nfs dumps none
> > >
> > >
> > > Martin
> >
> > I am to blame for testing only a very simple configuration!
> >
> > But it still seems wrong...
> >
> > Is it correct that ${SYSTEM_OBJ:N*swap*netbsd*} is only for
> > configurations named "netbsd*" - are other names allowed?
> >
> > Maybe we should just use ${SYSTEM_OBJ:N*swap*}, but I still
> > need to check whether any other object files match this pattern.
>
> The suggested pattern will not work.
>
> Perhaps I am the only person with a kernel not named "netbsd*"!
> I suggest reverting the change until we find a solution, with my
> apologies.

I am testing the patch below; it worked for my kernel, which is not
named "netbsd*", and I am now testing ZYXELKX. My machine is not the
fastest, so it will take a while.

Are there any other possible object files beginning with swap*?

--
Kind regards,

Yorick Hardy

Index: sys/conf/Makefile.kern.inc
===
RCS file: /cvsroot/src/sys/conf/Makefile.kern.inc,v
retrieving revision 1.251
diff -u -r1.251 Makefile.kern.inc
--- sys/conf/Makefile.kern.inc 4 Jan 2017 15:43:04 - 1.251
+++ sys/conf/Makefile.kern.inc 4 Jan 2017 18:12:27 -
@@ -213,10 +213,10 @@
 SYSTEM_LD?=${_MKSHMSG} " link ${.CURDIR:T}/${.TARGET}"; \
  ${_MKSHECHO}\
  ${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
- '$${SYSTEM_OBJ:N*swap*${.TARGET}*}' '$${EXTRA_OBJ}' vers.o \
+ '$${SYSTEM_OBJ:Nswap*}' '$${EXTRA_OBJ}' vers.o \
  ${OBJS:M*swap${.TARGET}.o}; \
  ${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
- ${SYSTEM_OBJ:N*swap*${.TARGET}*} ${EXTRA_OBJ} vers.o \
+ ${SYSTEM_OBJ:Nswap*} ${EXTRA_OBJ} vers.o \
  ${OBJS:M*swap${.TARGET}.o}
 
 TEXTADDR?= ${LOADADDRESS} # backwards compatibility
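To compare the candidate patterns discussed in this message, the following Python sketch lists which objects each pattern would drop. It approximates make's pattern matching with `fnmatch`, and the object list is an assumption: `uvm_swap.o` stands in for any object that merely contains "swap" (sys/uvm/uvm_swap.c exists in the tree, but whether it is built depends on the configuration), and `swapmykernel.o` is a hypothetical kernel not named "netbsd*".

```python
from fnmatch import fnmatchcase


def dropped_by(words, pattern):
    """Objects that an ':N<pattern>' modifier would remove (approximation)."""
    return [w for w in words if fnmatchcase(w, pattern)]


# Hypothetical object list for illustration only.
objs = [
    "kern_exec.o",
    "uvm_swap.o",
    "swapnetbsd.o",
    "swapnetbsd-sd0a.o",
    "swapnetbsd-reth0.o",
    "swapmykernel.o",
]

for pattern in ("*swap*netbsd*", "*swap*", "swap*"):
    print(pattern, "drops", dropped_by(objs, pattern))
```

With these inputs, `*swap*netbsd*` misses the kernel with a different name, `*swap*` also removes `uvm_swap.o`, and `swap*` removes only names that begin with "swap". Running the same check over a real object list would answer the question of whether any other object file begins with "swap".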
Re: Fix for kern/51772 breaks linking multi-config kernels?
On 2017-01-04, Yorick Hardy wrote:
> Dear Martin,
>
> On 2017-01-04, Martin Husemann wrote:
> > On Wed, Jan 04, 2017 at 07:28:19PM +0200, Yorick Hardy wrote:
> > > Apologies, my "fix" broke your build. I wonder why it worked before,
> > > probably because you have "netbsd" as part of your kernel name?
> > >
> > > Maybe the change should be reverted until the correct solution is found.
> >
> > Just a side note: we have several evb* configs that use multiple config
> > statements, for example sys/arch/evbmips/conf/ZYXELKX:
> >
> > config netbsd root on ? type ?
> > config netbsd-sd0a root on sd0a type ffs dumps none
> > config netbsd-reth0 root on reth0 type nfs dumps none
> >
> >
> > Martin
>
> I am to blame for testing only a very simple configuration!
>
> But it still seems wrong...
>
> Is it correct that ${SYSTEM_OBJ:N*swap*netbsd*} is only for
> configurations named "netbsd*" - are other names allowed?
>
> Maybe we should just use ${SYSTEM_OBJ:N*swap*}, but I still
> need to check whether any other object files match this pattern.

The suggested pattern will not work.

Perhaps I am the only person with a kernel not named "netbsd*"!
I suggest reverting the change until we find a solution, with my
apologies.

--
Kind regards,

Yorick Hardy
Re: Fix for kern/51772 breaks linking multi-config kernels?
Dear Martin,

On 2017-01-04, Martin Husemann wrote:
> On Wed, Jan 04, 2017 at 07:28:19PM +0200, Yorick Hardy wrote:
> > Apologies, my "fix" broke your build. I wonder why it worked before,
> > probably because you have "netbsd" as part of your kernel name?
> >
> > Maybe the change should be reverted until the correct solution is found.
>
> Just a side note: we have several evb* configs that use multiple config
> statements, for example sys/arch/evbmips/conf/ZYXELKX:
>
> config netbsd root on ? type ?
> config netbsd-sd0a root on sd0a type ffs dumps none
> config netbsd-reth0 root on reth0 type nfs dumps none
>
>
> Martin

I am to blame for testing only a very simple configuration!

But it still seems wrong...

Is it correct that ${SYSTEM_OBJ:N*swap*netbsd*} is only for
configurations named "netbsd*" - are other names allowed?

Maybe we should just use ${SYSTEM_OBJ:N*swap*}, but I still
need to check whether any other object files match this pattern.

--
Kind regards,

Yorick Hardy
Re: Fix for kern/51772 breaks linking multi-config kernels?
Dear John,

On 2017-01-04, John D. Baker wrote:
> Since this commit:
>
> http://mail-index.netbsd.org/source-changes/2017/01/04/msg080495.html
>
> My custom kernel with multiple "config" statements:
>
> include "arch/evbmips/conf/LOONGSON"
> [...]
> no config netbsd
> config netbsd_nfs root on ? type nfs dumps on wd0j
> config netbsd_sd0 root on sd0a type ffs dumps on wd0j
> config netbsd_sd1 root on sd1a type ffs
> [...]
>
> fails linking with:
>
> [...]
> # link YEELOONG/netbsd_nfs
> /d0/build/current/tools/amd64/bin/mips64el--netbsd-ld -Map netbsd_nfs.map
> --cref -m elf64ltsmip -T netbsd_nfs.ldscript -Ttext 0x8020 -e
> start -G 0 -X -o netbsd_nfs ${SYSTEM_OBJ:N*swap*netbsd_nfs*} ${EXTRA_OBJ}
> vers.o swapnetbsd_nfs.o
> swapnetbsd_sd1.o:(.data+0x0): multiple definition of `rootfstype'
> swapnetbsd_sd0.o:(.data+0x0): first defined here
> swapnetbsd_sd1.o:(.data+0x8): multiple definition of `dumpdev'
> swapnetbsd_sd0.o:(.data+0x8): first defined here
> swapnetbsd_sd1.o:(.data+0x10): multiple definition of `dumpspec'
> swapnetbsd_sd0.o:(.data+0x10): first defined here
> swapnetbsd_sd1.o:(.data+0x18): multiple definition of `rootdev'
> swapnetbsd_sd0.o:(.data+0x18): first defined here
> swapnetbsd_sd1.o:(.data+0x20): multiple definition of `rootspec'
> swapnetbsd_sd0.o:(.data+0x20): first defined here
> swapnetbsd_nfs.o:(.data+0x0): multiple definition of `rootfstype'
> swapnetbsd_sd0.o:(.data+0x0): first defined here
> swapnetbsd_nfs.o:(.data+0x8): multiple definition of `dumpdev'
> swapnetbsd_sd0.o:(.data+0x8): first defined here
> swapnetbsd_nfs.o:(.data+0x10): multiple definition of `dumpspec'
> swapnetbsd_sd0.o:(.data+0x10): first defined here
> swapnetbsd_nfs.o:(.data+0x18): multiple definition of `rootdev'
> swapnetbsd_sd0.o:(.data+0x18): first defined here
> swapnetbsd_nfs.o:(.data+0x20): multiple definition of `rootspec'
> swapnetbsd_sd0.o:(.data+0x20): first defined here
> /d0/build/current/tools/amd64/bin/mips64el--netbsd-ld: Warning: netbsd_nfs
> uses -msoft-float (set by locore.o), mips_fpu.o uses -mhard-float
> /d0/build/current/tools/amd64/bin/mips64el--netbsd-ld: Warning: netbsd_nfs
> uses -msoft-float (set by locore.o), fp.o uses -mhard-float
> *** [netbsd_nfs] Error code 1
>
> nbmake: stopped in
> /d0/build/current/obj/mips64el/sys/arch/evbmips/compile/YEELOONG
> 1 error
>
> nbmake: stopped in
> /d0/build/current/obj/mips64el/sys/arch/evbmips/compile/YEELOONG
>
> ERROR: Failed to make all in
> "/d0/build/current/obj/mips64el/sys/arch/evbmips/compile/YEELOONG"
> *** BUILD ABORTED ***
>
>
> Perhaps this is an update-build issue? I'll wipe my ".../compile"
> directory and try again.

Apologies, my "fix" broke your build. I wonder why it worked before,
probably because you have "netbsd" as part of your kernel name?

Maybe the change should be reverted until the correct solution is found.

--
Kind regards,

Yorick Hardy
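The link failure quoted above can be reproduced on paper. The following Python sketch applies the pattern from the failing link line, `*swap*netbsd_nfs*`, to the three swap objects named in the error output, again using `fnmatch` as an approximation of make's `:N` modifier. `locore.o` is taken from the linker warnings, and the rest of the object list is omitted. The comparison pattern `*swap*netbsd*` is the one mentioned elsewhere in the thread and is assumed here to be the earlier behaviour.

```python
from fnmatch import fnmatchcase

target = "netbsd_nfs"
system_obj = ["locore.o", "swapnetbsd_nfs.o", "swapnetbsd_sd0.o", "swapnetbsd_sd1.o"]


def swap_objects_linked(pattern):
    """Swap objects left on the link line after ':N<pattern>' plus the target's."""
    kept = [o for o in system_obj if not fnmatchcase(o, pattern)]
    linked = kept + ["vers.o", "swap" + target + ".o"]
    return [o for o in linked if o.startswith("swap")]


# Pattern from the failing link line: only the target's own object is removed.
print(swap_objects_linked("*swap*" + target + "*"))

# Assumed earlier pattern: removes every swap object for kernels named netbsd*.
print(swap_objects_linked("*swap*netbsd*"))
```

With the target-specific pattern, three swap objects reach the linker, each defining `rootfstype`, `dumpdev`, `dumpspec`, `rootdev` and `rootspec`, which matches the "multiple definition" errors. With `*swap*netbsd*` only the target's object remains, which would explain why the build worked before for kernels with "netbsd" in their names.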