from:"Yorick Hardy"

amdgpu: Atom BIOS from ACPI

2024-01-05 Thread Yorick Hardy

Dear current-users,

In the patch below, I have implemented the few missing pieces for loading
the Atom BIOS from ACPI which is needed when booting via UEFI. Now the
BIOS loads for me (but amdgpu does not work yet).
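
For readers unfamiliar with the code being patched: the amdgpu_bios.c hunk below touches the point where the VFCT table's total length is read from its header and compared against a minimum size before anything else is parsed (NetBSD's ACPICA headers spell the field `Length` where the Linux code has `length`). A minimal sketch of that kind of guard, assuming a stand-in header type; `struct table_header`, `MIN_TABLE_SIZE` and `table_long_enough` are hypothetical names, and the real code compares against `sizeof(UEFI_ACPI_VFCT)`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ACPI table header: signature, then total length. */
struct table_header {
	char signature[4];
	uint32_t length;	/* size of the whole table in bytes, header included */
};

/* Hypothetical minimum; the driver uses sizeof(UEFI_ACPI_VFCT) instead. */
#define MIN_TABLE_SIZE 64u

/* Reject a table that is too short to contain the structure we expect. */
static bool
table_long_enough(const struct table_header *hdr)
{
	assert(hdr != NULL);
	return hdr->length >= MIN_TABLE_SIZE;
}
```

The point of checking first is that every later offset into the table is only meaningful once the advertised length is known to be large enough.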

I am trying to get amdgpu working on a Ryzen 7 5700G with (from lspci):

 Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] (rev c8)

I guess this will be supported if/when drm is next updated, but I have
patched amdgpu_drv.c below to start looking at it. The next issue is
"PSP load tmr failed" ...

-- 
Kind regards,

Yorick Hardy

Index: sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_drv.c
===
RCS file: /cvsroot/src/sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_drv.c,v
retrieving revision 1.8
diff -u -r1.8 amdgpu_drv.c
--- sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_drv.c  19 Dec 2021 12:23:42 -  1.8
+++ sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_drv.c  5 Jan 2024 07:32:12 -
@@ -1013,6 +1013,7 @@
 
/* Renoir */
{0x1002, 0x1636, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU},
+   {0x1002, 0x1638, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU},
 
/* Navi12 */
{0x1002, 0x7360, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI12|AMD_EXP_HW_SUPPORT},
Index: sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_bios.c
===
RCS file: /cvsroot/src/sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_bios.c,v
retrieving revision 1.6
diff -u -r1.6 amdgpu_bios.c
--- sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_bios.c 27 Feb 2022 14:23:24 -  1.6
+++ sys/external/bsd/drm2/dist/drm/amd/amdgpu/amdgpu_bios.c 5 Jan 2024 07:32:09 -
@@ -402,7 +402,7 @@
return amdgpu_asic_read_disabled_bios(adev);
 }
 
-#ifdef CONFIG_ACPI
+#if defined(CONFIG_ACPI) || (NACPICA > 0)
 static bool amdgpu_acpi_vfct_bios(struct amdgpu_device *adev)
 {
struct acpi_table_header *hdr;
@@ -412,7 +412,11 @@
 
if (!ACPI_SUCCESS(acpi_get_table("VFCT", 1, &hdr)))
return false;
+#ifdef __NetBSD__
+   tbl_size = hdr->Length;
+#else
tbl_size = hdr->length;
+#endif
if (tbl_size < sizeof(UEFI_ACPI_VFCT)) {
DRM_ERROR("ACPI VFCT table present but broken (too short #1)\n");
return false;
Index: sys/external/bsd/drm2/include/linux/acpi.h
===
RCS file: /cvsroot/src/sys/external/bsd/drm2/include/linux/acpi.h,v
retrieving revision 1.10
diff -u -r1.10 acpi.h
--- sys/external/bsd/drm2/include/linux/acpi.h  28 May 2022 01:07:47 -  1.10
+++ sys/external/bsd/drm2/include/linux/acpi.h  5 Jan 2024 07:50:25 -
@@ -56,6 +56,7 @@
 union acpi_object *acpi_evaluate_dsm_typed(acpi_handle, const guid_t *,
 uint64_t, uint64_t, union acpi_object *, acpi_object_type);
 bool acpi_check_dsm(acpi_handle, const guid_t *, uint64_t, uint64_t);
+u32 acpi_get_table(const char*, u32, struct acpi_table_header **);
 
 #endif /* NACPICA > 0 */
 #endif  /* _LINUX_ACPI_H_ */
Index: sys/external/bsd/drm2/linux/linux_acpi.c
===
RCS file: /cvsroot/src/sys/external/bsd/drm2/linux/linux_acpi.c,v
retrieving revision 1.2
diff -u -r1.2 linux_acpi.c
--- sys/external/bsd/drm2/linux/linux_acpi.c  28 Feb 2022 17:15:30 -  1.2
+++ sys/external/bsd/drm2/linux/linux_acpi.c  5 Jan 2024 07:50:29 -
@@ -120,3 +120,9 @@
return true;
return false;
 }
+
+u32
+acpi_get_table(const char* signature, u32 instance, struct acpi_table_header **hdr)
+{
+   return AcpiGetTable(signature, instance, hdr);
+}
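
The amdgpu_drv.c hunk at the top of this patch adds one row (device 0x1638) to the driver's PCI ID table. A minimal sketch of how a table like that is matched, assuming a simplified row type; `struct id_entry`, `ANY_ID`, `example_table` and `match_device` are hypothetical names for illustration, not the drm structures:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_ID 0xffffffffu	/* hypothetical wildcard, stands in for PCI_ANY_ID */

/* Hypothetical, simplified table row: vendor, device, and a chip tag. */
struct id_entry {
	uint32_t vendor;
	uint32_t device;
	const char *chip;
};

static const struct id_entry example_table[] = {
	{ 0x1002, 0x1636, "RENOIR" },
	{ 0x1002, 0x1638, "RENOIR" },	/* the row the patch adds */
	{ 0x1002, 0x7360, "NAVI12" },
};

/* Return the first row matching vendor/device, or NULL if none does. */
static const struct id_entry *
match_device(const struct id_entry *table, size_t n, uint32_t vendor,
    uint32_t device)
{
	size_t i;

	assert(table != NULL);
	for (i = 0; i < n; i++) {
		const struct id_entry *e = &table[i];

		if ((e->vendor == ANY_ID || e->vendor == vendor) &&
		    (e->device == ANY_ID || e->device == device))
			return e;
	}
	return NULL;
}
```

Without the added row a lookup for 0x1002/0x1638 finds nothing, which is why the driver would not attach to that device at all.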

Re: Serious crashes on 9.99.93

2021-12-30 Thread Yorick Hardy

Dear current-users,

On 2021-12-28, pin wrote:
> ‐‐‐ Original Message ‐‐‐
> 
> Unfortunately, I had to wipe my disk due to a corrupted file system.
> I did a fresh install using the 9.99.93 image from today 28-Dec-2021 08:57
> 
> The same crash happened the first time I launched the web browser.
> This time I got a stack backtrace, as follows:
> 
> 
> pin@mybox # pwd
> /var/crash
> pin@mybox # ls -l
> total 797032
> -rw---  1 root  wheel  2 Dec 28 20:41 bounds
> -rw---  1 root  wheel  5 Dec 28 00:19 minfree
> -rw---  1 root  wheel    2332254 Dec 28 20:41 netbsd.0
> -rw---  1 root  wheel  405485080 Dec 28 20:41 netbsd.0.core
> pin@mybox # gdb --eval-command="file /netbsd.gdb"
> GNU gdb (GDB) 11.0.50.20200914-git
> Copyright (C) 2020 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64--netbsd".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <https://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> 
> For help, type "help".
> Type "apropos word" to search for commands related to "word".
> Reading symbols from /netbsd.gdb...
> (gdb) target kvm netbsd.0.core
> 0x802261f5 in cpu_reboot (howto=howto@entry=260, 
> bootstr=bootstr@entry=0x0) at /usr/src/sys/arch/amd64/amd64/machdep.c:720
> 720 /usr/src/sys/arch/amd64/amd64/machdep.c: No such file or directory.
> (gdb) bt
> #0  0x802261f5 in cpu_reboot (howto=howto@entry=260, 
> bootstr=bootstr@entry=0x0) at /usr/src/sys/arch/amd64/amd64/machdep.c:720
> #1  0x80dbcd54 in kern_reboot (howto=howto@entry=260, 
> bootstr=bootstr@entry=0x0) at /usr/src/sys/kern/kern_reboot.c:73
> #2  0x80dffcf2 in vpanic (fmt=fmt@entry=0x81390370 "trap", 
> ap=ap@entry=0x8580b77ebab8) at /usr/src/sys/kern/subr_prf.c:290
> #3  0x80dffdb7 in panic (fmt=fmt@entry=0x81390370 "trap") at 
> /usr/src/sys/kern/subr_prf.c:209
> #4  0x80229017 in trap (frame=0x8580b77ebc00) at 
> /usr/src/sys/arch/amd64/amd64/trap.c:326
> #5  0x802210e3 in alltraps ()
> #6  0xff404040ff404040 in ?? ()
> #7  0x0019 in ?? ()
> #8  0x80f752dea000 in ?? ()
> #9  0x in ?? ()
> (gdb)
> 
> 
> 
> This did not happen before on 9.99.92 up to the 15th of December.
> Hope that this is useful.
> 
> Best Regards

I tried to reproduce the problems that pin is seeing. I seldom got
as far as the web browser, but did get the following backtraces:

  Crash version 9.99.42, image version 9.99.93.
  WARNING: versions differ, you may not be able to examine this image.
  System panicked: pr_phinpage_check: [i915_request] item 0x939b82a4f840 not part of pool
  Backtrace from time of crash is available.
  _KERNEL_OPT_GENFB_GLYPHCACHE() at 0
  _KERNEL_OPT_GENFB_GLYPHCACHE() at 0
  sys_reboot() at sys_reboot
  vpanic() at vpanic+0x160
  device_printf() at device_printf
  pool_cache_put_paddr() at pool_cache_put_paddr+0x14f
  linux_dma_resv_fini() at linux_dma_resv_fini+0x55
  __i915_gem_free_object_rcu() at __i915_gem_free_object_rcu+0x45
  gc_thread() at gc_thread+0x92

  Crash version 9.99.42, image version 9.99.93.
  WARNING: versions differ, you may not be able to examine this image.
  System panicked: trap
  Backtrace from time of crash is available.
  _KERNEL_OPT_GENFB_GLYPHCACHE() at 0
  ?() at bf014f2a
  sys_reboot() at sys_reboot
  vpanic() at vpanic+0x160
  device_printf() at device_printf
  startlwp() at startlwp
  calltrap() at calltrap+0x19
  ffs_sync() at ffs_sync+0x75
  VFS_SYNC() at VFS_SYNC+0x22
  sched_sync() at sched_sync+0x90

Could the first backtrace be a clue? Changes were made in this area after the
15th, which is when pin still had a stable kernel.

-- 
Kind regards,

Yorick Hardy

Re: Serious crashes on 9.99.93

2021-12-28 Thread Yorick Hardy

Dear pin,

On 2021-12-28, pin wrote:
> 
> 
> Sent with ProtonMail Secure Email.
> 
> ‐‐‐ Original Message ‐‐‐
> 
> On Tuesday, December 28th, 2021 at 16:29, Yorick Hardy 
>  wrote:
> 
> > Dear pin,
> >
> > On 2021-12-27, pin wrote:
> >
> > > Hi all,
> > >
> > > > I upgraded my amd64 machine to NetBSD-9.99.93 yesterday and I'm
> > > > experiencing serious crashes which were not happening on 9.99.92.
> > >
> > > dmesg, https://pastebin.com/8WJeUJDj
> > >
> > > Xorg-log, https://pastebin.com/xTAmUZPU
> > >
> > > The backtraces from the coredumps aren't really useful, 
> > > https://pastebin.com/eaXYEC0Z
> > >
> > > I've managed to reproduce the crashes by launching lariza or badwolf web 
> > > browsers.
> > >
> > > The system runs without issues if I don't use a web browser.
> > >
> > > Also, I've noticed the following while booting after a crash
> > >
> > > panic: kernel diagnostic assertion "solocked2(so, so2)" failed: file 
> > > "/usr/src/sys/kern/uipc_usrreq.c", line 525
> > >
> > > Finally, unsure if related, console resolution doesn't scale after 
> > > loading i915drmkms0, it used to in 9.99.92.
> > >
> > > Although, resolution after startx is correct.
> > >
> > > I've hosted the core-dumps in case,
> > >
> > > netbsd.2.core.gz, https://ufile.io/d00lfx4f
> > >
> > > netbsd.2.gz, https://ufile.io/4yvklq5w
> > >
> > > Thank you for any hints.
> > >
> > > Best,
> > >
> > > pin
> > >
> > > Sent with ProtonMail Secure Email.
> >
> > Somehow I managed to get a backtrace (it seems to be correct):
> >
> > Crash version 9.99.82, image version 9.99.93.
> >
> > WARNING: versions differ, you may not be able to examine this image.
> >
> > crash: _kvm_kvatop(0)
> >
> > Kernel compiled without options LOCKDEBUG.
> >
> > System panicked: kernel diagnostic assertion "solocked2(so, so2)" failed: 
> > file "/usr/src/sys/kern/uipc_usrreq.c", line 525
> >
> > Backtrace from time of crash is available.
> >
> > crash> bt
> >
> > _KERNEL_OPT_GENFB_GLYPHCACHE() at 0
> >
> > _KERNEL_OPT_GENFB_GLYPHCACHE() at 0
> >
> > sys_reboot() at sys_reboot
> >
> > vpanic() at vpanic+0x160
> >
> > __x86_indirect_thunk_rax() at __x86_indirect_thunk_rax
> >
> > unp_send() at unp_send+0xa15
> >
> > sosend() at sosend+0x845
> >
> > soo_write() at soo_write+0x2f
> >
> > do_filewritev.part.0() at do_filewritev.part.0+0x25d
> >
> > syscall() at syscall+0x196
> >
> > --- syscall (number 121) ---
> >
> > syscall+0x196:
> >
> > crash>
> >
> > the last change I could find which might be relevant was in 
> > sys/kern/sys_generic.c 1.133, but that
> >
> > was before 9.99.93, so I am not sure where to look.
> >
> > --
> >
> > Kind regards,
> >
> > Yorick Hardy
> 
> Thanks!
> 
> Was this change made in 9.99.92 after the 15th of December?
> 
> Until the above date this did not happen.
> 
> Regards

11 September, so: long before! But thanks, I will continue to look
for any changes after 15 December (I am sure someone more knowledgeable
will find it, but I will have a go in the meantime).

-- 
Kind regards,

Yorick Hardy

Re: Serious crashes on 9.99.93

2021-12-28 Thread Yorick Hardy

Dear pin,

On 2021-12-27, pin wrote:
> Hi all,
> I upgraded my amd64 machine to NetBSD-9.99.93 yesterday and I'm experiencing
> serious crashes which were not happening on 9.99.92.
> dmesg, https://pastebin.com/8WJeUJDj
> Xorg-log, https://pastebin.com/xTAmUZPU
> 
> The backtraces from the coredumps aren't really useful, 
> https://pastebin.com/eaXYEC0Z
> 
> I've managed to reproduce the crashes by launching lariza or badwolf web 
> browsers.
> The system runs without issues if I don't use a web browser.
> Also, I've noticed the following while booting after a crash
> 
> panic: kernel diagnostic assertion "solocked2(so, so2)" failed: file 
> "/usr/src/sys/kern/uipc_usrreq.c", line 525
> 
> Finally, unsure if related, console resolution doesn't scale after loading 
> i915drmkms0, it used to in 9.99.92.
> Although, resolution after startx is correct.
> 
> I've hosted the core-dumps in case,
> netbsd.2.core.gz, https://ufile.io/d00lfx4f
> netbsd.2.gz, https://ufile.io/4yvklq5w
> 
> Thank you for any hints.
> Best,
> pin
> 
> Sent with ProtonMail Secure Email.

Somehow I managed to get a backtrace (it seems to be correct):

  Crash version 9.99.82, image version 9.99.93.
  WARNING: versions differ, you may not be able to examine this image.
  crash: _kvm_kvatop(0)
  Kernel compiled without options LOCKDEBUG.
  System panicked: kernel diagnostic assertion "solocked2(so, so2)" failed: file "/usr/src/sys/kern/uipc_usrreq.c", line 525
  Backtrace from time of crash is available.
  crash> bt
  _KERNEL_OPT_GENFB_GLYPHCACHE() at 0
  _KERNEL_OPT_GENFB_GLYPHCACHE() at 0
  sys_reboot() at sys_reboot
  vpanic() at vpanic+0x160
  __x86_indirect_thunk_rax() at __x86_indirect_thunk_rax
  unp_send() at unp_send+0xa15
  sosend() at sosend+0x845
  soo_write() at soo_write+0x2f
  do_filewritev.part.0() at do_filewritev.part.0+0x25d
  syscall() at syscall+0x196
  --- syscall (number 121) ---
  syscall+0x196:
  crash>

The last change I could find which might be relevant was in
sys/kern/sys_generic.c 1.133, but that was before 9.99.93, so I am not
sure where to look.

-- 
Kind regards,

Yorick Hardy

Re: Panic in usbd_create_xfer

2021-01-22 Thread Yorick Hardy

On 2021-01-03, Yorick Hardy wrote:
> Dear matthew,
> 
> On 2021-01-03, matthew green wrote:
> > Yorick Hardy writes:
> > > Dear current-users,
> > >
> > > Happy new year!
> > 
> > happy new year yorick! and everyone.
> > 
> > > [   659.839003] usbd_create_xfer() at netbsd:usbd_create_xfer+0x186
> > > [   659.849001] usbd_open_pipe_intr() at netbsd:usbd_open_pipe_intr+0x74
> > > [   659.849001] uhidev_open() at netbsd:uhidev_open+0x21c
> > 
> > can you find out what lines in the source these are? 
> > especially usbd_create_xfer+0x186, the other ones are
> > most likely obvious only the single callers - eg,
> > usbd_open_pipe_intr() calls usbd_create_xfer() once.
> > 
> > thanks.
> > 
> > 
> > .mrg.
> 
> In the disassembly (I guess due to inlining) it happens at a call to
> usb_allocmem, which I think is line sys/dev/usb/usbdi.c:606, i.e. the
> call to usbd_alloc_buffer.
> 
> I am currently trying to trigger the panic with all of the USB_DEBUG and
> {U,O,E}HCI_DEBUG options enabled but it has not happened yet (I was sure
> I would be able to panic the kernel by now; maybe the _DEBUG options
> work around the panic somehow?).

Reverting to src/sys/dev/usb/ohci.c revision 1.310 seems to
solve the panic; I have not yet determined how the change being
reverted could lead to a panic.

-- 
Kind regards,

Yorick Hardy

Re: Panic in usbd_create_xfer

2021-01-03 Thread Yorick Hardy

Dear matthew,

On 2021-01-03, matthew green wrote:
> Yorick Hardy writes:
> > Dear current-users,
> >
> > Happy new year!
> 
> happy new year yorick! and everyone.
> 
> > [   659.839003] usbd_create_xfer() at netbsd:usbd_create_xfer+0x186
> > [   659.849001] usbd_open_pipe_intr() at netbsd:usbd_open_pipe_intr+0x74
> > [   659.849001] uhidev_open() at netbsd:uhidev_open+0x21c
> 
> can you find out what lines in the source these are? 
> especially usbd_create_xfer+0x186, the other ones are
> most likely obvious only the single callers - eg,
> usbd_open_pipe_intr() calls usbd_create_xfer() once.
> 
> thanks.
> 
> 
> .mrg.

In the disassembly (I guess due to inlining) it happens at a call to
usb_allocmem, which I think is line sys/dev/usb/usbdi.c:606, i.e. the
call to usbd_alloc_buffer.
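
Mapping a frame such as `usbd_create_xfer+0x186` back to source starts by splitting it into a symbol name and a byte offset; the offset is then added to the symbol's address and looked up in the disassembly or the debug info. A minimal sketch of the split, where `parse_frame` is a hypothetical helper written for illustration and not part of any NetBSD tool:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split "symbol+0xoff" into a name and a numeric offset.
 * Returns 0 on success, -1 if there is no '+', the name does not fit,
 * or the offset is not a clean hexadecimal number.
 */
static int
parse_frame(const char *frame, char *name, size_t namelen, unsigned long *off)
{
	const char *plus;
	char *end;
	size_t n;

	assert(frame != NULL && name != NULL && off != NULL);

	plus = strchr(frame, '+');
	if (plus == NULL)
		return -1;
	n = (size_t)(plus - frame);
	if (n == 0 || n >= namelen)
		return -1;
	memcpy(name, frame, n);
	name[n] = '\0';

	/* Base 16 lets strtoul accept the optional "0x" prefix. */
	*off = strtoul(plus + 1, &end, 16);
	if (end == plus + 1 || *end != '\0')
		return -1;
	return 0;
}
```

Given a kernel image with debug symbols, gdb can do the whole lookup directly with `list *(usbd_create_xfer+0x186)`; with inlining, as noted above, the reported line may belong to the inlined callee rather than to the function named in the frame.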

I am currently trying to trigger the panic with all of the USB_DEBUG and
{U,O,E}HCI_DEBUG options enabled but it has not happened yet (I was sure
I would be able to panic the kernel by now; maybe the _DEBUG options
work around the panic somehow?).

-- 
Kind regards,

Yorick Hardy

Re: Panic in usbd_create_xfer

2020-12-31 Thread Yorick Hardy

Dear current-users,

Happy new year!

On 2020-12-30, Yorick Hardy wrote:
> Dear current-users,
> 
> Is anyone else seeing panics when opening a uhidev? (Generally from SDL.)
> I am using a custom kernel with a uintuos (not committed) Wacom, ukbd
> and ums hid devices. The panic only happens after a few days; the host
> controller is ehci (ATI SB700).
> 
> Some example panics:

I had forgotten to include DIAGNOSTIC; with it enabled, the panic shows a
warning from ohci:

[   659.829010] uvm_fault(0xb2272ae9f480, 0x0, 1) -> e
[   659.829010] fatal page fault in supervisor mode
[   659.829010] trap type 6 code 0 rip 0x80357ce6 cs 0x8 rflags 0x210206 cr2 0 ilevel 0 rsp 0xce014cdbea30
[   659.829010] curlwp 0xb2272638b9c0 pid 1880.1880 lowest kstack 0xce014cdba2c0
[   659.829010] panic: trap
[   659.829010] cpu0: Begin traceback...
[   659.829010] ohci1: WARNING: addr 0x40054bc0 not found
[   659.829010] vpanic() at netbsd:vpanic+0x156
[   659.839003] snprintf() at netbsd:snprintf
[   659.839003] startlwp() at netbsd:startlwp
[   659.839003] alltraps() at netbsd:alltraps+0xbb
[   659.839003] usbd_create_xfer() at netbsd:usbd_create_xfer+0x186
[   659.849001] usbd_open_pipe_intr() at netbsd:usbd_open_pipe_intr+0x74
[   659.849001] uhidev_open() at netbsd:uhidev_open+0x21c
[   659.849001] uhidopen() at netbsd:uhidopen+0xf9
[   659.859002] cdev_open() at netbsd:cdev_open+0xae
[   659.859002] spec_open() at netbsd:spec_open+0x176
[   659.869002] VOP_OPEN() at netbsd:VOP_OPEN+0x3c
[   659.869002] vn_open() at netbsd:vn_open+0x130
[   659.869002] do_open() at netbsd:do_open+0x119
[   659.879002] do_sys_openat() at netbsd:do_sys_openat+0x74
[   659.879002] sys_open() at netbsd:sys_open+0x24
[   659.889001] syscall() at netbsd:syscall+0x1cc
[   659.889001] --- syscall (number 5) ---
[   659.889001] netbsd:syscall+0x1cc:
[   659.889001] cpu0: End traceback...

-- 
Kind regards,

Yorick Hardy

Panic in usbd_create_xfer

2020-12-30 Thread Yorick Hardy

Dear current-users,

Is anyone else seeing panics when opening a uhidev? (Generally from SDL.)
I am using a custom kernel with a uintuos (not committed) Wacom, ukbd
and ums hid devices. The panic only happens after a few days; the host
controller is ehci (ATI SB700).

Some example panics:

[ 86314.177679] uvm_fault(0x9eda7c67d6f8, 0x81, 1) -> e
[ 86314.177679] fatal page fault in supervisor mode
[ 86314.177679] trap type 6 code 0 rip 0x8035784f cs 0x8 rflags 0x10286 cr2 0x810702 ilevel 0 rsp 0x9f8156ab2a40
[ 86314.177679] curlwp 0x9edaa1c52900 pid 786.786 lowest kstack 0x9f8156aae2c0
[ 86314.177679] panic: trap
[ 86314.177679] cpu1: Begin traceback...
[ 86314.177679] vpanic() at netbsd:vpanic+0x156
[ 86314.187672] snprintf() at netbsd:snprintf
[ 86314.187672] startlwp() at netbsd:startlwp
[ 86314.187672] alltraps() at netbsd:alltraps+0xbb
[ 86314.187672] usbd_create_xfer() at netbsd:usbd_create_xfer+0x186
[ 86314.197668] usbd_open_pipe_intr() at netbsd:usbd_open_pipe_intr+0x74
[ 86314.197668] uhidev_open() at netbsd:uhidev_open+0x18c
[ 86314.207668] uhidopen() at netbsd:uhidopen+0xe1
[ 86314.207668] cdev_open() at netbsd:cdev_open+0xae
[ 86314.207668] spec_open() at netbsd:spec_open+0x176
[ 86314.217669] VOP_OPEN() at netbsd:VOP_OPEN+0x3c
[ 86314.217669] vn_open() at netbsd:vn_open+0x130
[ 86314.217669] do_open() at netbsd:do_open+0x119
[ 86314.227669] do_sys_openat() at netbsd:do_sys_openat+0x74
[ 86314.227669] sys_open() at netbsd:sys_open+0x24
[ 86314.237670] syscall() at netbsd:syscall+0x1cc
[ 86314.237670] --- syscall (number 5) ---
[ 86314.237670] netbsd:syscall+0x1cc:
[ 86314.237670] cpu1: End traceback...

Crash version 9.99.77, image version 9.99.77.
crash: _kvm_kvatop(0)
Kernel compiled without options LOCKDEBUG.
System panicked: trap
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NAGR() at 0
?() at bb0150547000
sys_reboot() at sys_reboot
vpanic() at vpanic+0x160
snprintf() at snprintf
startlwp() at startlwp
calltrap() at calltrap+0x11
usbd_create_xfer() at usbd_create_xfer+0x186
usbd_open_pipe_intr() at usbd_open_pipe_intr+0x74
uhidev_open() at uhidev_open+0x18c
uhidopen() at uhidopen+0xe1
cdev_open() at cdev_open+0xae
spec_open() at spec_open+0x176
VOP_OPEN() at VOP_OPEN+0x3c
vn_open() at vn_open+0x130
do_open() at do_open+0x119
do_sys_openat() at do_sys_openat+0x74
sys_open() at sys_open+0x24
syscall() at syscall+0x1cc
--- syscall (number 5) ---
syscall+0x1cc:
crash>

Crash version 9.99.77, image version 9.99.77.
crash: _kvm_kvatop(0)
Kernel compiled without options LOCKDEBUG.
System panicked: trap
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NAGR() at 0
?() at b7814c8e
sys_reboot() at sys_reboot
vpanic() at vpanic+0x160
snprintf() at snprintf
startlwp() at startlwp
calltrap() at calltrap+0x11
usbd_create_xfer() at usbd_create_xfer+0x186
usbd_open_pipe_intr() at usbd_open_pipe_intr+0x74
uhidev_open() at uhidev_open+0x18c
uhidopen() at uhidopen+0xe1
cdev_open() at cdev_open+0xae
spec_open() at spec_open+0x176
VOP_OPEN() at VOP_OPEN+0x3c
vn_open() at vn_open+0x130
do_open() at do_open+0x119
do_sys_openat() at do_sys_openat+0x74
sys_open() at sys_open+0x24
syscall() at syscall+0x1cc
--- syscall (number 5) ---
syscall+0x1cc:

-- 
Kind regards,

Yorick Hardy

Re: build-success/install-fault on i486 with xsrc

2020-12-16 Thread Yorick Hardy

Dear Maya,

On 2020-11-01, m...@netbsd.org wrote:
> On Sun, Nov 01, 2020 at 04:44:41PM +0100, Lizbeth Mutterhunt, Ph.D wrote:
> > so!
> > 
> > sucess in the built! had a hung-up at the latest test:
> > 
> > _nv_cas.d.tmp atomic_or_64_nv_cas.d
> > : fatal error: when writing output to : No space left on device
> > 
> > but /tmp was empty and there were 4,5GB free. Damn old drag, did this
> > business! now gonna to do some emulation for rooting mobile.
> > 
> > thx and good luck in BSDing,
> 
> if you update src, it will re-enable GLX_USE_TLS and should just work on
> i386, no changes necessary.

Apologies for resurrecting an old thread; I am not sure it is entirely related.
I have not been able to use libGL on i386 for quite some time; the following
core dump might provide some information (the backtrace is unusable/empty).
This is on -current updated on 15 December. Unfortunately, I still do not know
the cause of the SIGSEGV.

Does glxgears work for anyone else (on i386+i915)?

-- 
Kind regards,

Yorick Hardy

$ gdb /usr/X11R7/bin/glxgears  /tmp/yorick.glxgears.core
GNU gdb (GDB) 11.0.50.20200914-git
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "i486--netbsdelf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/X11R7/bin/glxgears...
(No debugging symbols found in /usr/X11R7/bin/glxgears)
[New process 20022]
Core was generated by `glxgears'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0xba370252 in _mesa_x86_cliptest_points4 () from /usr/X11R7/lib/modules/dri/i915_dri.so
(gdb) disassemble
Dump of assembler code for function _mesa_x86_cliptest_points4:
   0xba3701e0 <+0>:   push   %esi
   0xba3701e1 <+1>:   push   %edi
   0xba3701e2 <+2>:   push   %ebp
   0xba3701e3 <+3>:   push   %ebx
   0xba3701e4 <+4>:   call   0xba3701f8 <_mesa_x86_cliptest_points4+24>
   0xba3701e9 <+9>:   add    $0x348e17,%ebx
   0xba3701ef <+15>:  lea    0x30fe0(%ebx),%ebx
   0xba3701f5 <+21>:  push   %ebx
   0xba3701f6 <+22>:  jmp    0xba3701fc <_mesa_x86_cliptest_points4+28>
   0xba3701f8 <+24>:  mov    (%esp),%ebx
   0xba3701fb <+27>:  ret
   0xba3701fc <+28>:  mov    0x18(%esp),%esi
   0xba370200 <+32>:  mov    0x1c(%esp),%edi
   0xba370204 <+36>:  mov    0x20(%esp),%edx
   0xba370208 <+40>:  mov    0x24(%esp),%ebx
   0xba37020c <+44>:  mov    0x28(%esp),%ebp
   0xba370210 <+48>:  mov    0x14(%esi),%eax
   0xba370213 <+51>:  mov    0x10(%esi),%ecx
   0xba370216 <+54>:  mov    0x8(%esi),%esi
   0xba370219 <+57>:  orl    $0xf,0x1c(%edi)
   0xba37021d <+61>:  mov    %eax,0x18(%esp)
   0xba370221 <+65>:  movl   $0x4,0x18(%edi)
   0xba370228 <+72>:  mov    %ecx,0x10(%edi)
   0xba37022b <+75>:  mov    0x8(%edi),%edi
   0xba37022e <+78>:  add    %edx,%ecx
   0xba370230 <+80>:  mov    %ecx,0x20(%esp)
   0xba370234 <+84>:  cmp    %ecx,%edx
   0xba370236 <+86>:  mov    (%ebx),%al
   0xba370238 <+88>:  mov    0x0(%ebp),%ah
   0xba37023b <+91>:  je     0xba3702e7 <_mesa_x86_cliptest_points4+263>
   0xba370241 <+97>:  lea    0x0(%esi,%eiz,1),%esi
   0xba370248 <+104>: lea    0x0(%esi,%eiz,1),%esi
   0xba37024f <+111>: nop
   0xba370250 <+112>: fld1
=> 0xba370252 <+114>: fdivs  0xc(%esi)
   0xba370255 <+117>: mov    0xc(%esi),%ebp
   0xba370258 <+120>: mov    0x8(%esi),%ebx
   0xba37025b <+123>: xor    %ecx,%ecx
--Type <RET> for more, q to quit, c to continue without paging--

Re: uvm_map_enter entry merging (was Re: vrelel...)

2020-11-29 Thread Yorick Hardy

Dear Chuck,

On 2020-11-29, Chuck Silvers wrote:
> hi Yorick,
> 
> On Sat, Nov 28, 2020 at 12:39:56AM +0200, Yorick Hardy wrote:
> > May I ask if you have an opinion on this patch? I have
> > not noticed any bad behaviour if it is omitted but, if I read
> > the code correctly, I don't think it is correct to fall through
> > for this case.
> 
> this function is very hard to follow, it's very tangled.
> I stared at it for a while and I didn't see anything wrong,
> but it's hard to be sure just from reading the code.

I am sorry to have made more work for you, and thanks for looking at
it!  I agree, and since I have only a superficial understanding of
how everything fits together I am not 100% sure that my reading is
correct.

> could you explain the specific case that you think is wrong now and
> that your patch fixes?

I will look at the tests as suggested below. Please read/ignore this
part as you see fit. Superficially (and perhaps that is part of the
"not reading correctly"!)

1) if we reach 

http://nxr.netbsd.org/xref/src/sys/uvm/uvm_map.c#1479

   then one of two things happened (with merged == 1):

a) an amap was extended
b) neither prev_entry nor prev_entry->next have an amap

2) now we reach

https://nxr.netbsd.org/xref/src/sys/uvm/uvm_map.c#1493

   and since merged == 1 in (1)

https://nxr.netbsd.org/xref/src/sys/uvm/uvm_map.c#1497

UVMMAP_EVCNT_DECR(ubackmerge);
UVMMAP_EVCNT_INCR(ubimerge);

  but in case 1(b) the forward merge never happened?

That was the starting point of my query. I thought perhaps that
1(b) never happens, but a printf shows that it happens often.

I am "living dangerously" and running with the patch for now,
although I am not sure yet where to look for any adverse effects.
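
The question above can be restated as a small decision function. This is only a model of the branch structure as described in this message and in the patch (three cases once a back merge has already happened: the previous entry has an amap, the next entry has an amap, or neither does); it is not the code in uvm_map.c, and `struct entry`, `forward_merge_step` and the enum are hypothetical names. Whether the third case should bail out to `nomerge` is exactly the open point, so the sketch takes it as a flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum merge_result { MERGE_CONTINUE, MERGE_NOMERGE };

/* Hypothetical, reduced map entry: only "has an amap" is modelled. */
struct entry {
	bool has_amap;
};

/*
 * Model of the forward-merge step when a back merge already happened.
 * extend_ok: whether extending the relevant amap would succeed.
 * bail_if_no_amap: models the proposed "else goto nomerge" branch.
 */
static enum merge_result
forward_merge_step(const struct entry *prev, const struct entry *next,
    bool extend_ok, bool bail_if_no_amap)
{
	assert(prev != NULL && next != NULL);

	if (prev->has_amap)		/* extend previous entry's amap forwards */
		return extend_ok ? MERGE_CONTINUE : MERGE_NOMERGE;
	else if (next->has_amap)	/* extend next entry's amap backwards */
		return extend_ok ? MERGE_CONTINUE : MERGE_NOMERGE;
	else if (bail_if_no_amap)	/* case 1(b) with the patch applied */
		return MERGE_NOMERGE;
	return MERGE_CONTINUE;		/* case 1(b) without the patch */
}
```

In this model the only input for which the patch changes the outcome is the one where neither entry has an amap, which matches case 1(b).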
 
> even better would be if you could write a set of atf tests to exercise
> all of the possible merge cases and verify that the contents of memory
> after the new mapping is created is what it should be.
> any previous and next mapping should have the same contents as before,
> and the new mapping should have either zeroes (for a new amap mapping)
> or the uobj contents at that offset (for a new uobj mapping).

I did not think I could do that, I will look into it! (Energy is low
again so, when I get to it ...)

> note that a vm_map_entry can reference both a uobj and an amap at the
> same time, so there are 4 possible cases for each of the previous and next
> entries (none, uobj, amap, uobj+amap), and two possible cases for the
> new entry (uobj, amap).  then I guess there are two more factors of 2
> for whether the forward and/or backward merges succeed, so that gives
> at least 128 cases to test.  I think there are some more cases hidden
> in there because there are multiple reasons why the merges might fail
> and those checks are in different places, so it would really be best
> to test all of the different possible paths through this function.
> 
> I would be reluctant to change anything here without such a set of
> comprehensive tests, because even if we are sure that a change fixes
> one case, it would be very hard to be sure that it doesn't break
> some other case.
> 
> -Chuck

Understood, it is quite possible that I am way out of my depth!
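
The case count in the quoted text (4 x 4 x 2 x 2 x 2 = 128) can be made concrete by enumerating the combinations a test driver would have to walk. A minimal sketch, assuming one record per combination; `struct merge_case`, the enums and `enumerate_cases` are hypothetical names and do not correspond to an existing ATF test:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What an existing neighbouring entry references. */
enum backing { B_NONE, B_UOBJ, B_AMAP, B_UOBJ_AMAP, B_COUNT };
/* What the newly created entry references. */
enum newkind { N_UOBJ, N_AMAP, N_COUNT };

/* One combination to exercise (hypothetical layout, for illustration). */
struct merge_case {
	enum backing prev, next;
	enum newkind fresh;
	bool back_ok, fwd_ok;	/* whether each merge is arranged to succeed */
};

/* Fill out[] with every combination; returns how many were written. */
static size_t
enumerate_cases(struct merge_case *out, size_t max)
{
	size_t n = 0;
	int p, x, k, b, f;

	assert(out != NULL);
	for (p = 0; p < B_COUNT; p++)
	for (x = 0; x < B_COUNT; x++)
	for (k = 0; k < N_COUNT; k++)
	for (b = 0; b < 2; b++)
	for (f = 0; f < 2; f++) {
		if (n == max)
			return n;
		out[n].prev = (enum backing)p;
		out[n].next = (enum backing)x;
		out[n].fresh = (enum newkind)k;
		out[n].back_ok = (b != 0);
		out[n].fwd_ok = (f != 0);
		n++;
	}
	return n;
}
```

As the quoted text notes, this is a lower bound: the different reasons a merge can fail would add further dimensions to the table.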

-- 
Kind regards,

Yorick Hardy

Re: Panic: vrelel: bad ref count (9.99.54)

2020-11-27 Thread Yorick Hardy

Dear Chuck,

On 2020-11-27, Chuck Silvers wrote:
> Hi Yorick,
> 
> On Fri, Nov 27, 2020 at 06:29:07PM +0200, Yorick Hardy wrote:
> > 
> > I think that uvm_mremap did not keep pace with changes in uvm.
> > This patch seems to fix it for me, although I have only tested
> > for two days so far (I am usually able to trigger the panic by
> > now ... but let's see).
> 
> Your patch looks good, please go ahead and commit it.
> 
> -Chuck

Thanks! May I ask if you have an opinion on this patch? I have
not noticed any bad behaviour if it is omitted but, if I read
the code correctly, I don't think it is correct to fall through
for this case.

-- 
Kind regards,

Yorick Hardy

Index: uvm_map.c
===
RCS file: /cvsroot/src/sys/uvm/uvm_map.c,v
retrieving revision 1.385
diff -u -r1.385 uvm_map.c
--- uvm_map.c   9 Jul 2020 05:57:15 -   1.385
+++ uvm_map.c   19 Nov 2020 16:04:07 -
@@ -1477,6 +1477,13 @@
amapwaitflag | AMAP_EXTEND_BACKWARDS))
goto nomerge;
}
+
+   /*
+* We could not extend either amap, just skip on.
+*/
+   else {
+   goto nomerge;
+   }
} else {
/*
 * Pull the next entry's amap backwards to cover this

Re: Panic: vrelel: bad ref count (9.99.54)

2020-11-27 Thread Yorick Hardy

Dear Andrew and Leonardo,

On 2020-11-19, Yorick Hardy wrote:
> Dear Andrew,
> 
> On 2020-05-05, Andrew Doran wrote:
> > On Mon, May 04, 2020 at 03:54:57PM +0200, Leonardo Taccari wrote:
> > > Hello Yorick and Andrew,
> > > 
> > > Yorick Hardy writes:
> > > > > > > [...]
> > > > > > > 
> > > > > > >   Crash version 9.99.55, image version 9.99.55.
> > > > > > >   crash: _kvm_kvatop(0)
> > > > > > >   Kernel compiled without options LOCKDEBUG.
> > > > > > >   System panicked: vrelel: bad ref count
> > > > > > >   Backtrace from time of crash is available.
> > > > > > >   crash> bt
> > > > > > >   _KERNEL_OPT_NAGR() at 0
> > > > > > >   ?() at 7f7ff7ecf000
> > > > > > >   sys_reboot() at sys_reboot
> > > > > > >   vpanic() at vpanic+0x181
> > > > > > >   vtryrele() at vtryrele
> > > > > > >   vcache_dealloc() at vcache_dealloc
> > > > > > >   uvm_unmap_detach() at uvm_unmap_detach+0x76
> > > > > > >   uvm_unmap1() at uvm_unmap1+0x4e
> > > > > > >   uvm_mremap() at uvm_mremap+0x36b
> > > > > > >   sys_mremap() at sys_mremap+0x68
> > > > > > >   syscall() at syscall+0x227
> > > > > > >   --- syscall (number 411) ---
> > > > > > >   797459842e9a:
> > > > > > >   crash>

[ rest of thread omitted ]

I think that uvm_mremap did not keep pace with changes in uvm.
This patch seems to fix it for me, although I have only tested
for two days so far (I am usually able to trigger the panic by
now ... but let's see).

Leonardo, would you be willing to try the patch?

-- 
Kind regards,

Yorick Hardy

Index: sys/uvm/uvm_mremap.c
===
RCS file: /cvsroot/src/sys/uvm/uvm_mremap.c,v
retrieving revision 1.20
diff -u -r1.20 uvm_mremap.c
--- sys/uvm/uvm_mremap.c  23 Feb 2020 15:46:43 -  1.20
+++ sys/uvm/uvm_mremap.c  26 Nov 2020 19:14:06 -
@@ -80,10 +80,8 @@
error = E2BIG; /* XXX */
goto done;
}
-   rw_enter(uobj->vmobjlock, RW_WRITER);
-   KASSERT(uobj->uo_refs > 0);
-   atomic_inc_uint(&uobj->uo_refs);
-   rw_exit(uobj->vmobjlock);
+   if (uobj->pgops->pgo_reference)
+   uobj->pgops->pgo_reference(uobj);
reserved_entry->object.uvm_obj = uobj;
reserved_entry->offset = newoffset;
}
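
The patch above replaces an open-coded increment of `uo_refs` with a call through the object's pager operations. A toy model of why that can matter, under the assumption (made for illustration only, not a statement about the UVM sources) that a backed object's reference hook also takes a reference on its backing store; all type and function names here are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

struct object;

/* Hypothetical ops table; only the reference hook is modelled. */
struct object_ops {
	void (*reference)(struct object *);
};

struct object {
	const struct object_ops *ops;
	unsigned refs;		/* references to the object itself */
	unsigned backing_refs;	/* references held on the backing store */
};

/* Hook for an object whose backing store is counted separately. */
static void
backed_reference(struct object *obj)
{
	obj->refs++;
	obj->backing_refs++;
}

static const struct object_ops backed_ops = { backed_reference };

/* Open-coded variant: bumps only the object's own count. */
static void
reference_open_coded(struct object *obj)
{
	assert(obj != NULL && obj->refs > 0);
	obj->refs++;
}

/* Variant in the style of the patch: let the object's hook decide. */
static void
reference_via_ops(struct object *obj)
{
	assert(obj != NULL);
	if (obj->ops != NULL && obj->ops->reference != NULL)
		obj->ops->reference(obj);
}
```

In this model the open-coded path leaves `backing_refs` one short, so a later release that drops both counts would release the backing store once too often. That is consistent with a reference-count panic at unmap time, but the model does not prove that this is what happened here.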

Re: Panic: vrelel: bad ref count (9.99.54)

2020-11-19 Thread Yorick Hardy

Dear Andrew,

On 2020-05-05, Andrew Doran wrote:
> On Mon, May 04, 2020 at 03:54:57PM +0200, Leonardo Taccari wrote:
> > Hello Yorick and Andrew,
> > 
> > Yorick Hardy writes:
> > > > > > [...]
> > > > > > 
> > > > > >   Crash version 9.99.55, image version 9.99.55.
> > > > > >   crash: _kvm_kvatop(0)
> > > > > >   Kernel compiled without options LOCKDEBUG.
> > > > > >   System panicked: vrelel: bad ref count
> > > > > >   Backtrace from time of crash is available.
> > > > > >   crash> bt
> > > > > >   _KERNEL_OPT_NAGR() at 0
> > > > > >   ?() at 7f7ff7ecf000
> > > > > >   sys_reboot() at sys_reboot
> > > > > >   vpanic() at vpanic+0x181
> > > > > >   vtryrele() at vtryrele
> > > > > >   vcache_dealloc() at vcache_dealloc
> > > > > >   uvm_unmap_detach() at uvm_unmap_detach+0x76
> > > > > >   uvm_unmap1() at uvm_unmap1+0x4e
> > > > > >   uvm_mremap() at uvm_mremap+0x36b
> > > > > >   sys_mremap() at sys_mremap+0x68
> > > > > >   syscall() at syscall+0x227
> > > > > >   --- syscall (number 411) ---
> > > > > >   797459842e9a:
> > > > > >   crash>
> > > > > 
> > > > > The same just happened on 9.99.56 while fetching (POP) mail using 
> > > > > mail/fdm.
> > > > 
> > > > Could you file a PR please?  If this panics again could you please run 
> > > > the
> > > > "dmesg" command in crash and find out what it printed about the vnode?  
> > > > That
> > > > would be very useful.
> > > > 
> > > > Thanks,
> > > > Andrew
> > >
> > > I will do so (... perhaps only this weekend).
> > > [...]
> > 
> > I was able to reproduce it too with yesterday evening's NetBSD/amd64
> > -current when using mail/fdm. I will try to prepare a minimal
> > reproducer using mail/fdm and file a PR if no one beats me to it.
> > 
> > In the meantime, here is the information from dmesg:
> > 
> > [ 6107.6380323] vnode 0xa95219747d40 flags 0x418
> > [ 6107.6380323]tag VT_TMPFS(25) type VREG(1) mount 0xa951f6d89000 
> > typedata 0xa95255e32c90
> > [ 6107.6380323]usecount 1 writecount 1 holdcount 0
> > [ 6107.6380323]size 18000 writesize 18000 numoutput 0
> > [ 6107.6380323]data 0xa952583304a0 lock 0xa95219747f00
> > [ 6107.6380323]state LOADED key(0xa951f6d89000 8) a0 04 33 58 52 a9 
> > ff ff
> > [ 6107.6380323]lrulisthd 0x816b5ed0
> > [ 6107.6380323]tag VT_TMPFS, tmpfs_node 0xa952583304a0, flags 0x0, 
> > links 1
> > [ 6107.6380323]mode 0600, owner 1000, group 0, size 98304
> > [ 6107.6380323] panic: vrelel: bad ref count
> > [ 6107.6380323] cpu0: Begin traceback...
> > [ 6107.6380323] vpanic() at netbsd:vpanic+0x178
> > [ 6107.6480364] vnpanic() at netbsd:vnpanic+0x49
> > [ 6107.6480364] vrelel() at netbsd:vrelel+0x5b6
> > [ 6107.6480364] uvm_unmap_detach() at netbsd:uvm_unmap_detach+0x8e
> > [ 6107.6480364] sys_munmap() at netbsd:sys_munmap+0x85
> > [ 6107.6480364] syscall() at netbsd:syscall+0x2a0
> > [ 6107.6480364] --- syscall (number 73) ---
> > [ 6107.6480364] 7c1e5d18414a:
> > [ 6107.6480364] cpu0: End traceback...
> > [ 6107.6480364] fatal breakpoint trap in supervisor mode
> > [ 6107.6480364] trap type 1 code 0 rip 0x802219fd cs 0x8 rflags 
> > 0x202 cr2 0x7f7ff7ee5000 ilevel 0 rsp 0xc100c227ae20
> > [ 6107.6480364] curlwp 0xa9521e1b1600 pid 20756.20756 lowest kstack 
> > 0xc100c22772c0
> > [ 6107.6480364] dumping to dev 0,1 (offset=276847, size=2062664):
> > [ 6107.6480364] dump
> > 
> > If any further information is needed, do not hesitate to
> > contact me!
> > 
> > 
> > Thanks!
> 
> Thank you.  I opened PR 55237 to track so I don't forget.
> 
> Andrew

I am still trying to track this down, but I can only understand small
pieces of the code at the moment. While going through uvm_map_enter
in sys/uvm/uvm_map.c, it looks like there is an unhandled case (patch
below). Is this correct? It seems to happen quite often, but with or
without the patch the system seems equally (un)stable.

-- 
Kind regards,

Yorick Hardy

Index: uvm_map.c
===
RCS file: /cvsroot/src/sys/uvm/uvm_map.c,v
retrieving revision 1.385
diff -u -r1.385 uvm_map.c
--- uvm_map.c   9 Jul 2020 05:57:15 -0000   1.385
+++ uvm_map.c   19 Nov 2020 16:04:07 -0000
@@ -1477,6 +1477,13 @@
amapwaitflag | AMAP_EXTEND_BACKWARDS))
goto nomerge;
}
+
+   /*
+* We could not extend either amap, just skip on.
+*/
+   else {
+   goto nomerge;
+   }
} else {
/*
 * Pull the next entry's amap backwards to cover this

Re: PATCH: Relax fdatasync checks to IEEE Std 1003.1-2008

2020-05-24 Thread Yorick Hardy

Thanks!

On 2020-05-24, Paul Ripke wrote:
> On Sun, May 24, 2020 at 06:56:54AM -0400, Greg Troxel wrote:
> > Yorick Hardy  writes:
> > 
> > (I realize you later say this isn't it.)
> > 
> > >> @@ -4141,10 +4140,6 @@ sys_fdatasync(struct lwp *l, const struct 
> > >> sys_fdatasync_args *uap, register_t *r
> > >>  /* fd_getvnode() will use the descriptor for us */
> > >>  if ((error = fd_getvnode(SCARG(uap, fd), &fp)) != 0)
> > >>  return (error);
> > >> -if ((fp->f_flag & FWRITE) == 0) {
> > >> -fd_putfile(SCARG(uap, fd));
> > >> -return (EBADF);
> > >> -}
> > >>  vp = fp->f_vnode;
> > >>  vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
> > >>  error = VOP_FSYNC(vp, fp->f_cred, FSYNC_WAIT|FSYNC_DATAONLY, 0, 
> > >> 0);
> > 
> > If you look at the function beyond what's in the diff, you will see (I
> > think, but I really mean I see) that there is always a single
> > fd_putfile.  This was just doing the put before returning, rather than
> > setting error and the usual "goto out" where the end-of-routine cleanup
> > happens.  See also sys_fsync_range() in the same file.
> > 
> > I could be reading this wrong.
> 
> I concur - this was just the fd_putfile to match the fd_getfile in
> the early error path for read-only files. The code now falls through
> and calls fd_putfile regardless, to remove the fd reference.
> 
> There should be no real behaviour change here apart from a relaxing
> of the success conditions.
> 
> -- 
> Paul Ripke
> "Great minds discuss ideas, average minds discuss events, small minds
>  discuss people."
> -- Disputed: Often attributed to Eleanor Roosevelt. 1948.

-- 
Kind regards,

Yorick Hardy
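
The point Greg and Paul make about fd_putfile can be sketched on its
own. This is a hypothetical model, not the kernel code: sketch_file,
sketch_getfile and sketch_putfile are invented stand-ins for the
descriptor reference that fd_getvnode takes and fd_putfile drops. It
shows that the removed block only released the reference on an early
error path, and that the remaining code still releases it exactly once.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical descriptor slot with a use count. */
struct sketch_file {
	int usecount;
	int writable;
};

static int
sketch_getfile(struct sketch_file *fp)
{
	fp->usecount++;
	return 0;
}

static void
sketch_putfile(struct sketch_file *fp)
{
	assert(fp->usecount > 0);
	fp->usecount--;
}

/* Old shape: the early error path has to drop the reference itself. */
static int
sketch_sync_old(struct sketch_file *fp)
{
	int error;

	if ((error = sketch_getfile(fp)) != 0)
		return error;
	if (!fp->writable) {
		sketch_putfile(fp);
		return EBADF;
	}
	/* ... the sync work would happen here ... */
	sketch_putfile(fp);
	return 0;
}

/* New shape: no writable check, one release at the end of the routine. */
static int
sketch_sync_new(struct sketch_file *fp)
{
	int error;

	if ((error = sketch_getfile(fp)) != 0)
		return error;
	error = 0;	/* ... the sync work would set this ... */
	sketch_putfile(fp);
	return error;
}
```

In both shapes the use count returns to its starting value; only the
result for a read-only descriptor changes.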

Re: PATCH: Relax fdatasync checks to IEEE Std 1003.1-2008

2020-05-24 Thread Yorick Hardy

On 2020-05-24, Yorick Hardy wrote:
> Dear Greg and Paul,
> 
> On 2020-03-25, Paul Ripke wrote:
> > On Mon, Mar 16, 2020 at 08:47:27AM -0400, Greg Troxel wrote:
> > >   [lots of test reports about fdatasync patch]
> > > 
> > > Thanks -- that's enough for me to be comfortable.
> > > and it's been proposed for more than long enough, with no adverse
> > > comments, so I'll commit it soonish.
> > 
> > fwiw, I missed a comment at the top of the function... fixed in
> > attached patch.
> > 
> > -- 
> > Paul Ripke
> > "Great minds discuss ideas, average minds discuss events, small minds
> >  discuss people."
> > -- Disputed: Often attributed to Eleanor Roosevelt. 1948.
> 
> > diff --git a/lib/libc/sys/fdatasync.2 b/lib/libc/sys/fdatasync.2
> > index 3f12119f0dbb..20da609191f5 100644
> > --- a/lib/libc/sys/fdatasync.2
> > +++ b/lib/libc/sys/fdatasync.2
> > @@ -68,7 +68,7 @@ function will fail if:
> >  .It Bq Er EBADF
> >  The
> >  .Fa fd
> > -argument is not a valid file descriptor open for writing.
> > +argument is not a valid file descriptor.
> >  .It Bq Er EINVAL
> >  This implementation does not support synchronized I/O for this file.
> >  .It Bq Er ENOSYS
> > @@ -93,4 +93,4 @@ and outstanding I/O operations are not guaranteed to have 
> > been completed.
> >  The
> >  .Fn fdatasync
> >  function conforms to
> > -.St -p1003.1b-93 .
> > +.St -p1003.1-2008 .
> > diff --git a/sys/kern/vfs_syscalls.c b/sys/kern/vfs_syscalls.c
> > index d51beedbfca9..8cfe5abe6cf8 100644
> > --- a/sys/kern/vfs_syscalls.c
> > +++ b/sys/kern/vfs_syscalls.c
> > @@ -4059,8 +4059,7 @@ sys_fsync(struct lwp *l, const struct sys_fsync_args 
> > *uap, register_t *retval)
> >   * Sync a range of file data.  API modeled after that found in AIX.
> >   *
> >   * FDATASYNC indicates that we need only save enough metadata to be able
> > - * to re-read the written data.  Note we duplicate AIX's requirement that
> > - * the file be open for writing.
> > + * to re-read the written data.
> >   */
> >  /* ARGSUSED */
> >  int
> > @@ -4141,10 +4140,6 @@ sys_fdatasync(struct lwp *l, const struct 
> > sys_fdatasync_args *uap, register_t *r
> > /* fd_getvnode() will use the descriptor for us */
> > if ((error = fd_getvnode(SCARG(uap, fd), &fp)) != 0)
> > return (error);
> > -   if ((fp->f_flag & FWRITE) == 0) {
> > -   fd_putfile(SCARG(uap, fd));
> > -   return (EBADF);
> > -   }
> > vp = fp->f_vnode;
> > vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
> > error = VOP_FSYNC(vp, fp->f_cred, FSYNC_WAIT|FSYNC_DATAONLY, 0, 0);
> 
> On 2020-03-25, Greg Troxel wrote:
> > Paul Ripke  writes:
> > 
> > > On Mon, Mar 16, 2020 at 08:47:27AM -0400, Greg Troxel wrote:
> > >>   [lots of test reports about fdatasync patch]
> > >> 
> > >> Thanks -- that's enough for me to be comfortable.
> > >> and it's been proposed for more than long enough, with no adverse
> > >> comments, so I'll commit it soonish.
> > 
> > > fwiw, I missed a comment at the top of the function... fixed in
> > > attached patch.
> > 
> > I have committed your patch, exactly as you just sent it.  My full
> > release build worked and I have an anita test run in process, just in
> > case.
> > 
> > Thanks for persevering on this.  It takes many people to fix all the
> > loose ends in an operating system!
> 
> I have been trying to find the cause of PR kern/55237. I am not at all
> familiar with the code, so please forgive me for pointing fingers!
> 
> I think the call to fd_putfile results in a close of the fd, but that
> does not happen anymore? Should it?
> 
> Apologies again if this has nothing to do with kern/55237.

It was not the cause of kern/55237, apologies!

-- 
Kind regards,

Yorick Hardy

Re: PATCH: Relax fdatasync checks to IEEE Std 1003.1-2008

2020-05-24 Thread Yorick Hardy

Dear Greg and Paul,

On 2020-03-25, Paul Ripke wrote:
> On Mon, Mar 16, 2020 at 08:47:27AM -0400, Greg Troxel wrote:
> >   [lots of test reports about fdatasync patch]
> > 
> > Thanks -- that's enough for me to be comfortable.
> > and it's been proposed for more than long enough, with no adverse
> > comments, so I'll commit it soonish.
> 
> fwiw, I missed a comment at the top of the function... fixed in
> attached patch.
> 
> -- 
> Paul Ripke
> "Great minds discuss ideas, average minds discuss events, small minds
>  discuss people."
> -- Disputed: Often attributed to Eleanor Roosevelt. 1948.

> diff --git a/lib/libc/sys/fdatasync.2 b/lib/libc/sys/fdatasync.2
> index 3f12119f0dbb..20da609191f5 100644
> --- a/lib/libc/sys/fdatasync.2
> +++ b/lib/libc/sys/fdatasync.2
> @@ -68,7 +68,7 @@ function will fail if:
>  .It Bq Er EBADF
>  The
>  .Fa fd
> -argument is not a valid file descriptor open for writing.
> +argument is not a valid file descriptor.
>  .It Bq Er EINVAL
>  This implementation does not support synchronized I/O for this file.
>  .It Bq Er ENOSYS
> @@ -93,4 +93,4 @@ and outstanding I/O operations are not guaranteed to have 
> been completed.
>  The
>  .Fn fdatasync
>  function conforms to
> -.St -p1003.1b-93 .
> +.St -p1003.1-2008 .
> diff --git a/sys/kern/vfs_syscalls.c b/sys/kern/vfs_syscalls.c
> index d51beedbfca9..8cfe5abe6cf8 100644
> --- a/sys/kern/vfs_syscalls.c
> +++ b/sys/kern/vfs_syscalls.c
> @@ -4059,8 +4059,7 @@ sys_fsync(struct lwp *l, const struct sys_fsync_args 
> *uap, register_t *retval)
>   * Sync a range of file data.  API modeled after that found in AIX.
>   *
>   * FDATASYNC indicates that we need only save enough metadata to be able
> - * to re-read the written data.  Note we duplicate AIX's requirement that
> - * the file be open for writing.
> + * to re-read the written data.
>   */
>  /* ARGSUSED */
>  int
> @@ -4141,10 +4140,6 @@ sys_fdatasync(struct lwp *l, const struct 
> sys_fdatasync_args *uap, register_t *r
>   /* fd_getvnode() will use the descriptor for us */
>   if ((error = fd_getvnode(SCARG(uap, fd), &fp)) != 0)
>   return (error);
> - if ((fp->f_flag & FWRITE) == 0) {
> - fd_putfile(SCARG(uap, fd));
> - return (EBADF);
> - }
>   vp = fp->f_vnode;
>   vn_lock(vp, LK_EXCLUSIVE | LK_RETRY);
>   error = VOP_FSYNC(vp, fp->f_cred, FSYNC_WAIT|FSYNC_DATAONLY, 0, 0);

On 2020-03-25, Greg Troxel wrote:
> Paul Ripke  writes:
> 
> > On Mon, Mar 16, 2020 at 08:47:27AM -0400, Greg Troxel wrote:
> >>   [lots of test reports about fdatasync patch]
> >> 
> >> Thanks -- that's enough for me to be comfortable.
> >> and it's been proposed for more than long enough, with no adverse
> >> comments, so I'll commit it soonish.
> 
> > fwiw, I missed a comment at the top of the function... fixed in
> > attached patch.
> 
> I have committed your patch, exactly as you just sent it.  My full
> release build worked and I have an anita test run in process, just in
> case.
> 
> Thanks for persevering on this.  It takes many people to fix all the
> loose ends in an operating system!

I have been trying to find the cause of PR kern/55237. I am not at all
familiar with the code, so please forgive me for pointing fingers!

I think the call to fd_putfile results in a close of the fd, but that
does not happen anymore? Should it?

Apologies again if this has nothing to do with kern/55237.

-- 
Kind regards,

Yorick Hardy

Re: Panic: vrelel: bad ref count (9.99.54)

2020-04-20 Thread Yorick Hardy

Dear Andrew,

On 2020-04-19, Andrew Doran wrote:
> Hi Yorick.
> 
> On Sat, Apr 18, 2020 at 11:00:02AM +0200, Yorick Hardy wrote:
> 
> > > I just had the same panic with 9.99.55:
> > > 
> > >   Crash version 9.99.55, image version 9.99.55.
> > >   crash: _kvm_kvatop(0)
> > >   Kernel compiled without options LOCKDEBUG.
> > >   System panicked: vrelel: bad ref count
> > >   Backtrace from time of crash is available.
> > >   crash> bt
> > >   _KERNEL_OPT_NAGR() at 0
> > >   ?() at 7f7ff7ecf000
> > >   sys_reboot() at sys_reboot
> > >   vpanic() at vpanic+0x181
> > >   vtryrele() at vtryrele
> > >   vcache_dealloc() at vcache_dealloc
> > >   uvm_unmap_detach() at uvm_unmap_detach+0x76
> > >   uvm_unmap1() at uvm_unmap1+0x4e
> > >   uvm_mremap() at uvm_mremap+0x36b
> > >   sys_mremap() at sys_mremap+0x68
> > >   syscall() at syscall+0x227
> > >   --- syscall (number 411) ---
> > >   797459842e9a:
> > >   crash>
> > 
> > The same just happened on 9.99.56 while fetching (POP) mail using mail/fdm.
> 
> Could you file a PR please?  If this panics again could you please run the
> "dmesg" command in crash and find out what it printed about the vnode?  That
> would be very useful.
> 
> Thanks,
> Andrew

I will do so (... perhaps only this weekend).

-- 
Kind regards,

Yorick Hardy

Re: Panic: vrelel: bad ref count (9.99.54)

2020-04-18 Thread Yorick Hardy

Dear Andrew,

On 2020-04-16, Yorick Hardy wrote:
> Dear Andrew,
> 
> On 2020-04-08, Yorick Hardy wrote:
> > Dear Andrew,
> > 
> > On 2020-04-07, Yorick Hardy wrote:
> > > Dear Andrew,
> > > 
> > > On 2020-04-07, Andrew Doran wrote:
> > > > Hi Yorick.
> > > > 
> > > > On Mon, Apr 06, 2020 at 11:16:37PM +0200, Yorick Hardy wrote:
> > > > 
> > > > >Crash version 9.99.54, image version 9.99.54.
> > > > >crash: _kvm_kvatop(0)
> > > > >Kernel compiled without options LOCKDEBUG.
> > > > >System panicked: vrelel: bad ref count
> > > > >Backtrace from time of crash is available.
> > > > >crash> bt
> > > > >_KERNEL_OPT_NAGR() at 0
> > > > >?() at 7f7ff7ecf000
> > > > >sys_reboot() at sys_reboot
> > > > >vpanic() at vpanic+0x181
> > > > >vtryrele() at vtryrele
> > > > >vcache_dealloc() at vcache_dealloc
> > > > >uvm_unmap_detach() at uvm_unmap_detach+0x76
> > > > >uvm_unmap1() at uvm_unmap1+0x4e
> > > > >uvm_mremap() at uvm_mremap+0x36b
> > > > >sys_mremap() at sys_mremap+0x68
> > > > >syscall() at syscall+0x227
> > > > >--- syscall (number 411) ---
> > > > >7f0af7842e9a:
> > > > 
> > > > Were you running anything noteworthy at the time?  There is a very good
> > > > chance that is fixed by revision 1.217 of src/sys/kern/vfs_lookup.c.
> > > > 
> > > > Thanks,
> > > > Andrew
> > > 
> > > Thanks! It happens often when running wip/urlwatch, which keeps
> > > the disk quite busy. I will try a new kernel as soon as I can free
> > > up my computer!
> > 
> > Initial tests are unable to trigger the panic, so it looks promising!
> > Let's assume it is fixed; I will continue to use this kernel and report
> > back if necessary.
> > 
> > Thanks again!
> 
> I just had the same panic with 9.99.55:
> 
>   Crash version 9.99.55, image version 9.99.55.
>   crash: _kvm_kvatop(0)
>   Kernel compiled without options LOCKDEBUG.
>   System panicked: vrelel: bad ref count
>   Backtrace from time of crash is available.
>   crash> bt
>   _KERNEL_OPT_NAGR() at 0
>   ?() at 7f7ff7ecf000
>   sys_reboot() at sys_reboot
>   vpanic() at vpanic+0x181
>   vtryrele() at vtryrele
>   vcache_dealloc() at vcache_dealloc
>   uvm_unmap_detach() at uvm_unmap_detach+0x76
>   uvm_unmap1() at uvm_unmap1+0x4e
>   uvm_mremap() at uvm_mremap+0x36b
>   sys_mremap() at sys_mremap+0x68
>   syscall() at syscall+0x227
>   --- syscall (number 411) ---
>   797459842e9a:
>   crash>

The same just happened on 9.99.56 while fetching (POP) mail using mail/fdm.

I have 9.99.42 running without issues, but I have not had the time to
bisect further.

-- 
Kind regards,

Yorick Hardy

Re: Panic: vrelel: bad ref count (9.99.54)

2020-04-16 Thread Yorick Hardy

Dear Andrew,

On 2020-04-08, Yorick Hardy wrote:
> Dear Andrew,
> 
> On 2020-04-07, Yorick Hardy wrote:
> > Dear Andrew,
> > 
> > On 2020-04-07, Andrew Doran wrote:
> > > Hi Yorick.
> > > 
> > > On Mon, Apr 06, 2020 at 11:16:37PM +0200, Yorick Hardy wrote:
> > > 
> > > >Crash version 9.99.54, image version 9.99.54.
> > > >crash: _kvm_kvatop(0)
> > > >Kernel compiled without options LOCKDEBUG.
> > > >System panicked: vrelel: bad ref count
> > > >Backtrace from time of crash is available.
> > > >crash> bt
> > > >_KERNEL_OPT_NAGR() at 0
> > > >?() at 7f7ff7ecf000
> > > >sys_reboot() at sys_reboot
> > > >vpanic() at vpanic+0x181
> > > >vtryrele() at vtryrele
> > > >vcache_dealloc() at vcache_dealloc
> > > >uvm_unmap_detach() at uvm_unmap_detach+0x76
> > > >uvm_unmap1() at uvm_unmap1+0x4e
> > > >uvm_mremap() at uvm_mremap+0x36b
> > > >sys_mremap() at sys_mremap+0x68
> > > >syscall() at syscall+0x227
> > > >--- syscall (number 411) ---
> > > >7f0af7842e9a:
> > > 
> > > Were you running anything noteworthy at the time?  There is a very good
> > > chance that is fixed by revision 1.217 of src/sys/kern/vfs_lookup.c.
> > > 
> > > Thanks,
> > > Andrew
> > 
> > Thanks! It happens often when running wip/urlwatch, which keeps
> > the disk quite busy. I will try a new kernel as soon as I can free
> > up my computer!
> 
> Initial tests are unable to trigger the panic, so it looks promising!
> Let's assume it is fixed; I will continue to use this kernel and report
> back if necessary.
> 
> Thanks again!

I just had the same panic with 9.99.55:

  Crash version 9.99.55, image version 9.99.55.
  crash: _kvm_kvatop(0)
  Kernel compiled without options LOCKDEBUG.
  System panicked: vrelel: bad ref count
  Backtrace from time of crash is available.
  crash> bt
  _KERNEL_OPT_NAGR() at 0
  ?() at 7f7ff7ecf000
  sys_reboot() at sys_reboot
  vpanic() at vpanic+0x181
  vtryrele() at vtryrele
  vcache_dealloc() at vcache_dealloc
  uvm_unmap_detach() at uvm_unmap_detach+0x76
  uvm_unmap1() at uvm_unmap1+0x4e
  uvm_mremap() at uvm_mremap+0x36b
  sys_mremap() at sys_mremap+0x68
  syscall() at syscall+0x227
  --- syscall (number 411) ---
  797459842e9a:
  crash>

but I am not sure what caused it. I will update and try again.

-- 
Kind regards,

Yorick Hardy

Re: Panic: vrelel: bad ref count (9.99.54)

2020-04-07 Thread Yorick Hardy

Dear Andrew,

On 2020-04-07, Yorick Hardy wrote:
> Dear Andrew,
> 
> On 2020-04-07, Andrew Doran wrote:
> > Hi Yorick.
> > 
> > On Mon, Apr 06, 2020 at 11:16:37PM +0200, Yorick Hardy wrote:
> > 
> > >Crash version 9.99.54, image version 9.99.54.
> > >crash: _kvm_kvatop(0)
> > >Kernel compiled without options LOCKDEBUG.
> > >System panicked: vrelel: bad ref count
> > >Backtrace from time of crash is available.
> > >crash> bt
> > >_KERNEL_OPT_NAGR() at 0
> > >?() at 7f7ff7ecf000
> > >sys_reboot() at sys_reboot
> > >vpanic() at vpanic+0x181
> > >vtryrele() at vtryrele
> > >vcache_dealloc() at vcache_dealloc
> > >uvm_unmap_detach() at uvm_unmap_detach+0x76
> > >uvm_unmap1() at uvm_unmap1+0x4e
> > >uvm_mremap() at uvm_mremap+0x36b
> > >sys_mremap() at sys_mremap+0x68
> > >syscall() at syscall+0x227
> > >--- syscall (number 411) ---
> > >7f0af7842e9a:
> > 
> > Were you running anything noteworthy at the time?  There is a very good
> > chance that is fixed by revision 1.217 of src/sys/kern/vfs_lookup.c.
> > 
> > Thanks,
> > Andrew
> 
> Thanks! It happens often when running wip/urlwatch, which keeps
> the disk quite busy. I will try a new kernel as soon as I can free
> up my computer!

Initial tests are unable to trigger the panic, so it looks promising!
Let's assume it is fixed; I will continue to use this kernel and report
back if necessary.

Thanks again!

-- 
Kind regards,

Yorick Hardy

Re: Panic: vrelel: bad ref count (9.99.54)

2020-04-07 Thread Yorick Hardy

Dear Andrew,

On 2020-04-07, Andrew Doran wrote:
> Hi Yorick.
> 
> On Mon, Apr 06, 2020 at 11:16:37PM +0200, Yorick Hardy wrote:
> 
> >Crash version 9.99.54, image version 9.99.54.
> >crash: _kvm_kvatop(0)
> >Kernel compiled without options LOCKDEBUG.
> >System panicked: vrelel: bad ref count
> >Backtrace from time of crash is available.
> >crash> bt
> >_KERNEL_OPT_NAGR() at 0
> >?() at 7f7ff7ecf000
> >sys_reboot() at sys_reboot
> >vpanic() at vpanic+0x181
> >vtryrele() at vtryrele
> >vcache_dealloc() at vcache_dealloc
> >uvm_unmap_detach() at uvm_unmap_detach+0x76
> >uvm_unmap1() at uvm_unmap1+0x4e
> >uvm_mremap() at uvm_mremap+0x36b
> >sys_mremap() at sys_mremap+0x68
> >syscall() at syscall+0x227
> >--- syscall (number 411) ---
> >7f0af7842e9a:
> 
> Were you running anything noteworthy at the time?  There is a very good
> chance that is fixed by revision 1.217 of src/sys/kern/vfs_lookup.c.
> 
> Thanks,
> Andrew

Thanks! It happens often when running wip/urlwatch, which keeps
the disk quite busy. I will try a new kernel as soon as I can free
up my computer!

-- 
Kind regards,

Yorick Hardy

Panic: vrelel: bad ref count (9.99.54)

2020-04-06 Thread Yorick Hardy

Dear current users,

Has anyone else seen this?

   Crash version 9.99.54, image version 9.99.54.
   crash: _kvm_kvatop(0)
   Kernel compiled without options LOCKDEBUG.
   System panicked: vrelel: bad ref count
   Backtrace from time of crash is available.
   crash> bt
   _KERNEL_OPT_NAGR() at 0
   ?() at 7f7ff7ecf000
   sys_reboot() at sys_reboot
   vpanic() at vpanic+0x181
   vtryrele() at vtryrele
   vcache_dealloc() at vcache_dealloc
   uvm_unmap_detach() at uvm_unmap_detach+0x76
   uvm_unmap1() at uvm_unmap1+0x4e
   uvm_mremap() at uvm_mremap+0x36b
   sys_mremap() at sys_mremap+0x68
   syscall() at syscall+0x227
   --- syscall (number 411) ---
   7f0af7842e9a:
   crash>

-- 
Kind regards,

Yorick Hardy

Re: Audio recording (using ossaudio)

2020-03-20 Thread Yorick Hardy

Dear Tetsuya,

On 2020-03-20, Tetsuya Isaki wrote:
> At Fri, 20 Mar 2020 08:08:37 +0200,
> Yorick Hardy wrote:
> > It seems to be stuck in select (or poll, I did not check the source)
> > in portaudio.
> 
> Yeah, I'm looking into this just this week.
> poll()/select() before read() doesn't work correctly now.
> I will fix it.
> 
> > Updating audio/portaudio from portaudio-190600.20161030nb1 to 
> > portaudio-190600.20161030nb2
> > fixes the problem (maybe because of the patch to disable non-blocking I/O 
> > ?).
> 
> I looked at it just now, and it looks like a bad strategy.
> He should have reported it first...
> 
> > Now 44100 Hz does not sound right (I will send the example off-list), but
> > 48000 Hz is fine (this is the same behaviour as audiorecord).
> 
> I heard it but unfortunately I don't know expected status.
> It sounded like analog noise or environmental noise.
> 
> 
> Anyway, if you want to record at a true 44100 Hz, you need to
> set the hardware to 44100 Hz mode using the audiocfg(1) command:
>  # audiocfg set  r slinear_le 16 2 44100
> 
> On NetBSD 7 (or earlier), if you recorded at 44100 Hz, the kernel set
> the hardware to 44100 Hz, because the audio system served a single
> app at a time.
> 
> On NetBSD 9 (or later), multiple recording apps can run
> simultaneously.  So even if your single app wants to record at
> 44100 Hz, the kernel cannot change the hardware frequency.
> The kernel converts from the hardware frequency to your
> requested frequency (if they differ).
> The in-kernel frequency conversion is simpler (and faster and smaller)
> than what rich userland apps do (and I personally think that
> such rich operations should be done in userland).
> 
> You need to either a) change the hardware format (with audiocfg) or
> b) record in the hardware format (which "audiocfg list" displays)
> and convert it with a rich userland converter.
> 
> Thanks,
> ---
> Tetsuya Isaki 

Thanks! I think Nia mentioned this also, but somehow
I did not fully understand the role of audiocfg.

-- 
Kind regards,

Yorick Hardy
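
To make the frequency conversion Tetsuya describes concrete, here is a
minimal sketch of a simple rate converter using linear interpolation.
It is an assumption for illustration only: the name sketch_resample is
hypothetical, it handles mono 16-bit samples, and nothing here claims
to be the algorithm the NetBSD kernel actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert mono 16-bit samples from src_rate to dst_rate by linear
 * interpolation.  Returns the number of samples written to dst.
 */
static size_t
sketch_resample(const int16_t *src, size_t src_len, unsigned src_rate,
    int16_t *dst, size_t dst_cap, unsigned dst_rate)
{
	size_t n = 0;

	assert(src_rate > 0 && dst_rate > 0);
	if (src_len == 0)
		return 0;
	for (;; n++) {
		/* Output sample n sits at input position idx + frac/dst_rate. */
		uint64_t pos = (uint64_t)n * src_rate;
		size_t idx = (size_t)(pos / dst_rate);
		uint64_t frac = pos % dst_rate;

		if (n >= dst_cap || idx >= src_len)
			break;
		int32_t a = src[idx];
		/* Past the last input sample, hold the final value. */
		int32_t b = (idx + 1 < src_len) ? src[idx + 1] : a;
		dst[n] = (int16_t)(a +
		    (int64_t)(b - a) * (int64_t)frac / (int64_t)dst_rate);
	}
	return n;
}
```

Converting 480 samples captured at 48000 Hz to 44100 Hz this way yields
441 samples. A converter this small is cheap, but it does no filtering,
which is the kind of trade-off that favours doing higher-quality
conversion in userland.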

Re: Audio recording (using ossaudio)

2020-03-19 Thread Yorick Hardy

(Oops: forgot to Cc the list.)

Dear Tetsuya,

On 2020-03-20, Tetsuya Isaki wrote:
> At Thu, 19 Mar 2020 21:36:00 +0200,
> Yorick Hardy wrote:
> > > >  ffmpeg4 -f oss -i /dev/audio -channels 1 -sample_rate 48000 
> > > > /tmp/test.wav
> > > > 
> > > > is completely garbled and too short. The file also seems to be 
> > > > 2-channel,
> > > > so I think the recording settings are somehow not applied correctly.
> > > 
> > > I rarely use ffmpeg4 but according to ffmpeg4 documents,
> > > -channels/-sample_rate are for video and -ac/-ar are for audio?
> > 
> > If I used it correctly, it is for "-f oss" so for the input. Maybe it
> > should go before "-i", but if I recall correctly it does not make
> > a difference.
> > 
> > I think "-ac" is for the output format (ffmpeg performs appropriate
> > conversion).
> 
> I see, sorry for noise.

No problem at all!

> > > As you said, the output file was too short.  However, ffmpeg4 probably
> > > recorded for the specified period and created a small file, so I think
> > > you need to look at ffmpeg4 first.
> > 
> > I did not figure out why the file is too short, but there is some
> > oss/ffmpeg interaction (maybe due to non-blocking reads?) which causes
> > this.
> 
> It's hard to believe that non-blocking reads are the cause.
> I've implemented non-blocking I/O carefully too.  If you find out how
> to reproduce the problem, please send a PR.

I will try again to reproduce the problem with a minimal
program. It was not easy to reproduce the ffmpeg problem
(I am not sure why, maybe it is a combined pts calculation
and non-blocking read problem).

> By the way, in ffmpeg-4.2.1/libavdevice/oss_dec.c:
>   70 static int audio_read_packet(AVFormatContext *s1, AVPacket *pkt)
>   71 {
>   :
>   95 /* subtract time represented by the number of bytes in the audio fif
>   96 cur_time -= (bdelay * 1000000LL) / (s->sample_rate * s->channels);
> 
> I wonder why this calculation doesn't account for the sample precision
> (16 bits = 2 bytes).  I modified this line but it did not seem to
> improve things, and I didn't chase it further.

I am sure the ffmpeg calculation is wrong, but (as you say) changing it 
does not fix the problem.

-- 
Kind regards,

Yorick Hardy
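
The arithmetic Tetsuya questions can be written out as a small helper.
This is a sketch under assumptions, not a patch for ffmpeg: the name
sketch_bytes_to_usec and the bytes_per_sample parameter are
hypothetical, and it assumes interleaved PCM where every sample has the
same size.

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds of audio represented by `bytes` of interleaved PCM. */
static int64_t
sketch_bytes_to_usec(int64_t bytes, int sample_rate, int channels,
    int bytes_per_sample)
{
	assert(sample_rate > 0 && channels > 0 && bytes_per_sample > 0);
	return (bytes * 1000000LL) /
	    ((int64_t)sample_rate * channels * bytes_per_sample);
}
```

For 16-bit stereo at 48000 Hz, 192000 bytes is one second of audio.
Leaving out the sample size, as the quoted line does, gives two seconds
for the same buffer. As both of us found, correcting this on its own
did not fix the recording problem.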

Re: Audio recording (using ossaudio)

2020-03-19 Thread Yorick Hardy

Dear Tetsuya,

On 2020-03-19, Tetsuya Isaki wrote:
> At Sat, 14 Mar 2020 15:05:37 +0200,
> Yorick Hardy wrote:
> > Re: audacity (earlier in the thread), audacity hangs whenever I try to
> > record.
> 
> Would you tell me how to reproduce it?

First: my pkgsrc is not up to date -- so there could be some other reason for 
this.

To reproduce:

 1) install audio/audacity (I have audacity-2.3.3nb2)
 2) start audacity and press the record button!

and audacity becomes unresponsive (and also does not repaint when uncovered).

> Thanks,
> ---
> Tetsuya Isaki 

-- 
Kind regards,

Yorick Hardy

Re: Audio recording (using ossaudio)

2020-03-19 Thread Yorick Hardy

Dear Tetsuya,

On 2020-03-19, Tetsuya Isaki wrote:
> At Tue, 10 Mar 2020 20:49:55 +0200,
> Yorick Hardy wrote:
> >  ffmpeg4 -f oss -i /dev/audio -channels 1 -sample_rate 48000 /tmp/test.wav
> > 
> > is completely garbled and too short. The file also seems to be 2-channel,
> > so I think the recording settings are somehow not applied correctly.
> 
> I rarely use ffmpeg4 but according to ffmpeg4 documents,
> -channels/-sample_rate are for video and -ac/-ar are for audio?

If I used it correctly, it is for "-f oss" so for the input. Maybe it
should go before "-i", but if I recall correctly it does not make
a difference.

I think "-ac" is for the output format (ffmpeg performs appropriate
conversion).

>  % /usr/bin/time ffmpeg4 -f oss -t 0:05 -i /dev/audio -channels 1 test1.wav
>  5.04 real 0.02 user 0.04 sys
>  % file test1.wav
>  test1.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, 
> stereo 48000 Hz
> 
>  % /usr/bin/time ffmpeg4 -f oss -t 0:05 -i /dev/audio -ac 1 test2.wav
>  5.04 real 0.04 user 0.02 sys
>  % file test2.wav
>  test2.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, 
> mono 48000 Hz
> 
>  % /usr/bin/time audioplay test1.wav
>  2.54 real 0.00 user 0.00 sys
>  % /usr/bin/time audioplay test2.wav
>  2.54 real 0.00 user 0.00 sys
> 
> As you said, the output file was too short.  However, ffmpeg4 probably
> recorded for the specified period and created a small file, so I think
> you need to look at ffmpeg4 first.

I did not figure out why the file is too short, but there is some
oss/ffmpeg interaction (maybe due to non-blocking reads?) which causes
this.

I created wip/ffmpeg4-nbsdaudio and nia improved it, and now recording
works fine:

 ffmpeg4 -f nbsdaudio -i /dev/audio /tmp/netbsd.wav

records as expected.

> Thanks,
> ---
> Tetsuya Isaki 

Thank you!

-- 
Kind regards,

Yorick Hardy

Re: Audio recording (using ossaudio)

2020-03-15 Thread Yorick Hardy

Dear nia,

On 2020-03-14, nia wrote:
> On Sat, Mar 14, 2020 at 12:20:11AM +0200, Yorick Hardy wrote:
> > You are correct. I threw together a NetBSD audio driver based on the oss
> > driver, but it had exactly the same problem. Strangely, I have been unable 
> > to
> > reproduce the problem on an old i386 netbook (so far).
> > 
> > I wrote a test program to try and reproduce what ffmpeg is doing, and
> > (I am not sure yet) it seems like non-blocking reads are causing the
> > distortion. The same test program with blocking reads seems to work
> > okay.
> > 
> > I will look into it a bit more, and then report back.
> 
> Right, /dev/audio doesn't support non-blocking I/O. But you're supposed to
> do short enough reads and writes that it shouldn't matter. That might be
> the cause of the worst of the problems.

Oops, I think the man page might need to be updated then.
I managed to convince my test program to correctly record
with non-blocking I/O (perhaps by accident?) by working
a bit differently to ffmpeg, but I am not sure how to
adjust ffmpeg in this way. I will try blocking reads
(presumably reading blocksize bytes at a time).
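
A minimal sketch of the blocking approach, assuming the descriptor is
opened without O_NONBLOCK and that the block size has been obtained
elsewhere. The name sketch_read_block is hypothetical and this is not
the code from the test program or from ffmpeg.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Read exactly `len` bytes from a blocking descriptor, retrying on
 * short reads and EINTR.  Returns the number of bytes read (less than
 * len only at end of file), or -1 on error.
 */
static ssize_t
sketch_read_block(int fd, void *buf, size_t len)
{
	size_t done = 0;

	assert(buf != NULL);
	while (done < len) {
		ssize_t n = read(fd, (char *)buf + done, len - done);

		if (n == 0)
			break;		/* end of file */
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

A recorder would call this once per block and hand each complete block
on for timestamping, so it never sees a partially filled buffer.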

Re: audacity (earlier in the thread), audacity hangs whenever I try to
record. I probably need to update all of my packages - but I am not
doing any long builds at the moment due to unpredictable electricity
supply!

> Do you want to work on this together somewhere?

Yes, that would be great! As long as you don't mind someone who is
extremely unresponsive most of the week! I have quite a few deadlines
in the next week, and will probably ignore most things while I am
doing that work.

Attached are the patches for my "testing" version of the ffmpeg
backend (heavily based on the OSS backend). I am sure it should be
renamed to "netbsd"; initially I was trying for Sun compatibility,
but I am not sure that makes sense.

-- 
Kind regards,

Yorick Hardy
$NetBSD$

--- configure.orig  2019-08-05 21:11:40.000000000 +0000
+++ configure
@@ -2115,6 +2115,7 @@ HEADERS_LIST="
 opencv2_core_core_c_h
 OpenGL_gl3_h
 poll_h
+sys_audioio_h
 sys_param_h
 sys_resource_h
 sys_select_h
@@ -3306,6 +3307,8 @@ android_camera_indev_deps="android camer
 android_camera_indev_extralibs="-landroid -lcamera2ndk -lmediandk"
 alsa_indev_deps="alsa"
 alsa_outdev_deps="alsa"
+audioio_indev_deps_any="sys_audioio_h"
+audioio_outdev_deps_any="sys_audioio_h"
 avfoundation_indev_deps="avfoundation corevideo coremedia pthreads"
 avfoundation_indev_suggest="coregraphics applicationservices"
 avfoundation_indev_extralibs="-framework Foundation"
@@ -6461,6 +6464,10 @@ check_headers "dev/bktr/ioctl_meteor.h d
 check_headers "dev/video/meteor/ioctl_meteor.h 
dev/video/bktr/ioctl_bt848.h" ||
 check_headers "dev/ic/bt8xx.h"
 
+if check_struct sys/audioio.h audio_info_t play; then
+enable_sanitized sys/audioio.h
+fi
+
 if check_struct sys/soundcard.h audio_buf_info bytes; then
 enable_sanitized sys/soundcard.h
 else


patch-doc_indevs.texi
Description: TeXInfo document


patch-doc_outdevs.texi
Description: TeXInfo document
$NetBSD$

--- libavdevice/Makefile.orig   2019-08-05 20:52:21.0 +
+++ libavdevice/Makefile
@@ -15,6 +15,8 @@ OBJS-$(CONFIG_SHARED)   
 OBJS-$(CONFIG_ALSA_INDEV)+= alsa_dec.o alsa.o timefilter.o
 OBJS-$(CONFIG_ALSA_OUTDEV)   += alsa_enc.o alsa.o
 OBJS-$(CONFIG_ANDROID_CAMERA_INDEV)  += android_camera.o
+OBJS-$(CONFIG_AUDIOIO_INDEV) += audioio_dec.o audioio.o
+OBJS-$(CONFIG_AUDIOIO_OUTDEV)+= audioio_enc.o audioio.o
 OBJS-$(CONFIG_AVFOUNDATION_INDEV)+= avfoundation.o
 OBJS-$(CONFIG_BKTR_INDEV)+= bktr.o
 OBJS-$(CONFIG_CACA_OUTDEV)   += caca.o
$NetBSD$

--- libavdevice/alldevices.c.orig   2019-08-05 20:52:21.0 +
+++ libavdevice/alldevices.c
@@ -27,6 +27,8 @@
 extern AVInputFormat  ff_alsa_demuxer;
 extern AVOutputFormat ff_alsa_muxer;
 extern AVInputFormat  ff_android_camera_demuxer;
+extern AVInputFormat  ff_audioio_demuxer;
+extern AVOutputFormat ff_audioio_muxer;
 extern AVInputFormat  ff_avfoundation_demuxer;
 extern AVInputFormat  ff_bktr_demuxer;
 extern AVOutputFormat ff_caca_muxer;
--- /dev/null   2020-03-11 11:29:21.727010528 +0200
+++ libavdevice/audioio.c   2020-03-11 11:41:43.710391401 +0200
@@ -0,0 +1,125 @@
+/*
+ * Sun and NetBSD play and grab interface
+ * Copyright (c) 2020 Yorick Hardy
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is

Re: current: completely stuck after four minutes of uptime

2020-03-15 Thread Yorick Hardy

On 2020-03-15, Chavdar Ivanov wrote:
> Hi,
> 
> On Sun, 15 Mar 2020 at 11:07, Chavdar Ivanov  wrote:
> >
> > On Sun, 15 Mar 2020 at 10:29, Thomas Klausner  wrote:
> > >
> > > Hi!
> > >
> > > I've just upgraded my 9.99.49 kernel from March 12 to today's from an
> > > hour ago.
> > >
> > > After rebooting, the machine got stuck in less than five minutes.
> > >
> > > No reaction to CTRL-ALT-ESC from the console, no reaction to pressing
> > > the power button.
> >
> > Mine is from
> >
> > NetBSD 9.99.49 (GENERIC) #1: Sun Mar 15 02:33:56 GMT 2020
> >
> > and works just fine; upgraded three machines without any problem.
> 
> I was somewhat quick to conclude that. One of the upgraded machines, a
> VirtualBox guest running GENERIC_KASLR, after perhaps 3 hours appeared
> stuck and unresponsive; I couldn't get into the debugger either. The
> other physical box is still working, though (and rebuilding).
> 
> >
> > >
> > >  Thomas


Maybe completely unrelated, but a kernel compiled yesterday hangs
while booting for me. After pressing the power button, the debugger
stops in "usb_disconnect_port".

-- 
Kind regards,

Yorick Hardy

Re: change within last day broke nvmm

2020-03-15 Thread Yorick Hardy

Dear Tobias,

On 2020-03-15, Tobias Nygren wrote:
> Hi,
> 
> This is consistently reproducible while trying to boot Linux on nvmm.
> 
> panic: LIST_INSERT_HEAD 0x88713368 x86/pmap.c:2135
> vpanic()
> panic()
> pmap_enter_pv()
> pmap_ept_enter()
> uvm_fault_lower_enter()
> uvm_fault_internal()
> nvmm_ioctl()
> sys_ioctl()
> syscall()
> 
> -Tobias

I think Maxime would like to be Cc'd on NVMM issues, so I am doing that here.

-- 
Kind regards,

Yorick Hardy

Re: Audio recording (using ossaudio)

2020-03-14 Thread Yorick Hardy

Dear nia,

On 2020-03-14, Yorick Hardy wrote:
> On 2020-03-14, Yorick Hardy wrote:
> > Dear nia,
> > 
> > On 2020-03-14, nia wrote:
> > > On Sat, Mar 14, 2020 at 12:20:11AM +0200, Yorick Hardy wrote:
> > > > You are correct. I threw together a NetBSD audio driver based on the oss
> > > > driver, but it had exactly the same problem. Strangely, I have been
> > > > unable to reproduce the problem on an old i386 netbook (so far).
> > > > 
> > > > I wrote a test program to try and reproduce what ffmpeg is doing, and
> > > > (I am not sure yet) it seems like non-blocking reads is causing the
> > > > distortion. The same test program with blocking reads seems to work
> > > > okay.
> > > > 
> > > > I will look into it a bit more, and then report back.
> > > 
> > > Right, /dev/audio doesn't support non-blocking I/O. But you're supposed to
> > > do short enough reads and writes that it shouldn't matter. That might be
> > > the cause of the worst of the problems.
> > 
> > Oops, I think the man page might need to be updated then.
> > I managed to convince my test program to correctly record
> > with non-blocking I/O (perhaps by accident?) by working
> > a bit differently to ffmpeg, but I am not sure how to
> > adjust ffmpeg in this way. I will try blocking reads
> > (presumably reading blocksize bytes at a time).
> > 
> > Re: audacity (earlier in the thread), audacity hangs whenever I try to
> > record. I probably need to update all of my packages - but I am not
> > doing any long builds at the moment due to unpredictable electricity
> > supply!
> > 
> > > Do you want to work on this together somewhere?
> > 
> > Yes, that would be great! As long as you don't mind someone who is
> > extremely unresponsive most of the week! I have quite a few deadlines
> > in the next week, and will probably ignore most things while I am
> > doing that work.
> > 
> > Attached are the patches for my "testing" version of the ffmpeg
> > backend (heavily based on the OSS backend). I am sure it should be
> > renamed to "netbsd", initially I was trying for sun compatibility -
> > but I am not sure that makes sense.
> 
> Just a side note that I keep forgetting to mention: I used ffmpeg
> and oss to record videos over the last few years and "it used to
> work fine". I think that the most recent audio changes have
> broken some expectations that ffmpeg has, but it used to work
> (more or less) as ffmpeg expected.
> 
> That said, I think a netbsd audio backend would be great. I will
> create a pkgsrc-wip package in the mean time to start working on
> a netbsd audio backend, unless someone beats me to it!
> 
> [wip/ags also has some audio problems, it uses allegro for audio;
>  but I will start another thread about that one day.]

I have imported wip/ffmpeg4-nbsdaudio which, thanks to your comments,
now manages some simple recording and playback.

I am sure many improvements still need to be made, but I am happy that I can
record videos again if I need to!

-- 
Kind regards,

Yorick Hardy

Re: Audio recording (using ossaudio)

2020-03-14 Thread Yorick Hardy

On 2020-03-14, Yorick Hardy wrote:
> Dear nia,
> 
> On 2020-03-14, nia wrote:
> > On Sat, Mar 14, 2020 at 12:20:11AM +0200, Yorick Hardy wrote:
> > > You are correct. I threw together a NetBSD audio driver based on the oss
> > > driver, but it had exactly the same problem. Strangely, I have been
> > > unable to reproduce the problem on an old i386 netbook (so far).
> > > 
> > > I wrote a test program to try and reproduce what ffmpeg is doing, and
> > > (I am not sure yet) it seems like non-blocking reads is causing the
> > > distortion. The same test program with blocking reads seems to work
> > > okay.
> > > 
> > > I will look into it a bit more, and then report back.
> > 
> > Right, /dev/audio doesn't support non-blocking I/O. But you're supposed to
> > do short enough reads and writes that it shouldn't matter. That might be
> > the cause of the worst of the problems.
> 
> Oops, I think the man page might need to be updated then.
> I managed to convince my test program to correctly record
> with non-blocking I/O (perhaps by accident?) by working
> a bit differently to ffmpeg, but I am not sure how to
> adjust ffmpeg in this way. I will try blocking reads
> (presumably reading blocksize bytes at a time).
> 
> Re: audacity (earlier in the thread), audacity hangs whenever I try to
> record. I probably need to update all of my packages - but I am not
> doing any long builds at the moment due to unpredictable electricity
> supply!
> 
> > Do you want to work on this together somewhere?
> 
> Yes, that would be great! As long as you don't mind someone who is
> extremely unresponsive most of the week! I have quite a few deadlines
> in the next week, and will probably ignore most things while I am
> doing that work.
> 
> Attached are the patches for my "testing" version of the ffmpeg
> backend (heavily based on the OSS backend). I am sure it should be
> renamed to "netbsd", initially I was trying for sun compatibility -
> but I am not sure that makes sense.

Just a side note that I keep forgetting to mention: I used ffmpeg
and oss to record videos over the last few years and "it used to
work fine". I think that the most recent audio changes have
broken some expectations that ffmpeg has, but it used to work
(more or less) as ffmpeg expected.

That said, I think a netbsd audio backend would be great. I will
create a pkgsrc-wip package in the meantime to start working on
a netbsd audio backend, unless someone beats me to it!

[wip/ags also has some audio problems, it uses allegro for audio;
 but I will start another thread about that one day.]

-- 
Kind regards,

Yorick Hardy

Re: Audio recording (using ossaudio)

2020-03-13 Thread Yorick Hardy

Dear nia,

On 2020-03-13, nia wrote:
> On Tue, Mar 10, 2020 at 08:49:55PM +0200, Yorick Hardy wrote:
> > Can anyone else record audio correctly via ossaudio?
> > audiorecord seems to work as long as the frequency
> > divides the native frequency (see dmesg excerpt below)
> 
> (I missed this post, but got contacted about it directly off-list. I'm
> probably a good person to contact about this sort of thing).

Thanks, I watched your recent talk :-)

> The sample rate and number of channels need to exactly match the device
> for ideal output.

Right, but I think it should not be a problem if it is different
(subject to some quality loss) because of the new audio system?
> 
> The ffmpeg OSS code looks very primitive. I might be persuaded to write
> a backend that does detection of device characteristics. It's basically
> required for proper audio recording on NetBSD. Note that our OSS emulation
> doesn't match the spec exactly, and is also undocumented, so doing anything
> non-trivial is hard. I don't recommend writing new code that uses it for
> that reason.

You are correct. I threw together a NetBSD audio driver based on the oss
driver, but it had exactly the same problem. Strangely, I have been unable to
reproduce the problem on an old i386 netbook (so far).

I wrote a test program to try to reproduce what ffmpeg is doing, and
(I am not sure yet) it seems like non-blocking reads are causing the
distortion. The same test program with blocking reads seems to work
okay.

I will look into it a bit more, and then report back.

> Out of curiosity, does Audacity work for you (when set to single channel
> 16-bit PCM, etc - it defaults to 32-bit floats which won't work).

I am building it ... my computer is quite old! I will report back
once I have tried it out.

-- 
Kind regards,

Yorick Hardy

Audio recording (using ossaudio)

2020-03-10 Thread Yorick Hardy

Dear current-users,

Can anyone else record audio correctly via ossaudio?
audiorecord seems to work as long as the frequency
divides the native frequency (see dmesg excerpt below)

  audiorecord -d /dev/audio -c 1 -s 48000 -e slinear_le -P 16 /tmp/test.wav

seems to work fine, and

  audiorecord -d /dev/audio -c 1 -s 44100 -e slinear_le -P 16 /tmp/test.wav

is a little noisy (maybe due to sample rate conversion?).
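
The difference between the two rates can be made concrete with a little arithmetic. The sketch below assumes a native rate of 48000 Hz; `gcd_u` and `rate_divides_native` are hypothetical helpers for illustration only, and whether the noise really comes from the conversion is not established here.

```c
/* Greatest common divisor by Euclid's algorithm. */
unsigned int
gcd_u(unsigned int a, unsigned int b)
{
	while (b != 0) {
		unsigned int t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/*
 * Returns 1 if the requested rate divides the native rate exactly
 * (so conversion is a whole-number decimation), 0 otherwise.
 */
int
rate_divides_native(unsigned int native, unsigned int requested)
{
	return requested != 0 && native % requested == 0;
}
```

For 44100 against 48000 the greatest common divisor is 300, so the ratio reduces to 147/160, which is not a whole-number step and needs interpolation of some kind.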

But

 ffmpeg4 -f oss -i /dev/audio -channels 1 -sample_rate 48000 /tmp/test.wav

is completely garbled and too short. The file also seems to be 2-channel,
so I think the recording settings are somehow not applied correctly.

On the other hand,

 gst-launch-1.0 osssrc \! audio/x-raw,channels=1,format=S16LE,rate=48000 \! wavenc \! filesink location=/tmp/test.wav

works fine (I am not sure what it does differently).

Apologies for not finding the cause of the issue; I have not had time to
investigate. I need ffmpeg recording to work in case I have to make videos
for my classes :-)

-- 
Kind regards,

Yorick Hardy

Re: Weird qemu-nvmm problem

2020-03-10 Thread Yorick Hardy

On 2020-03-09, Chavdar Ivanov wrote:
> On Mon, 9 Mar 2020 at 19:52, Yorick Hardy  wrote:
> >
> > Dear Chavdar,
> >
> > On 2020-03-08, Chavdar Ivanov wrote:
> > > Hi,
> > >
> > > On a -current (from today, but has happened before), when running a
> > > particular nvmm guest (32-bit Windows 10), usually when it is busy
> > > going through some updates, the host gets into a weird state. I have
> > > an ssh connection to it with several tmux panes open; I can switch
> > > between them, so the connection to the host is still ok, but in the
> > > same time the host does not answer to pings anymore; none of the tmux
> > > panes themselves accepts any input, with the exception of the one I am
> > > running the qemu client in; I can interrupt it and after that the host
> > > comes into normal state. When this happens, I get
> > >
> > > [  7444.602404] coretemp3: workqueue busy: updates stopped
> > > [  7474.614306] coretemp0: workqueue busy: updates stopped
> > > [  7474.614306] coretemp1: workqueue busy: updates stopped
> > > [  7474.614306] coretemp2: workqueue busy: updates stopped
> > > [ 23591.005414] acpitz4: workqueue busy: updates stopped
> > > [ 23591.005414] acpibat1: workqueue busy: updates stopped
> > >
> > > The machine is not doing anything else at the moment, the temperatures
> > > are within the expected range.
> > >
> > > Any clues?
> > >
> > > Chavdar
> >
> > Unfortunately, "me too". But I did not manage to see the logs or
> > track down the change which caused this behaviour (sorry).
> 
> At least I can see it is not something specific to my host and installation.
> 
> >
> > I had panics for a while (in January?) when using nvmm, and once
> > that was fixed the "stalling" behaviour started.
> 
> BTW, the hangs happen with my Windows guests; I tried today a couple
> of Linux ones, they ran without a problem.
> 
> I just tried also an OmniOS guest; this one used to work fine on the
> 29th of February; when I booted and updated it today it did not
> complete the boot at all and I had to restart the host as it was
> unresponsive.

That was my experience too (Windows guests hang, but Linux guests
do not seem to hang).

-- 
Kind regards,

Yorick Hardy

Re: Weird qemu-nvmm problem

2020-03-09 Thread Yorick Hardy

Dear Chavdar,

On 2020-03-08, Chavdar Ivanov wrote:
> Hi,
> 
> On a -current (from today, but has happened before), when running a
> particular nvmm guest (32-bit Windows 10), usually when it is busy
> going through some updates, the host gets into a weird state. I have
> an ssh connection to it with several tmux panes open; I can switch
> between them, so the connection to the host is still ok, but in the
> same time the host does not answer to pings anymore; none of the tmux
> panes themselves accepts any input, with the exception of the one I am
> running the qemu client in; I can interrupt it and after that the host
> comes into normal state. When this happens, I get
> 
> [  7444.602404] coretemp3: workqueue busy: updates stopped
> [  7474.614306] coretemp0: workqueue busy: updates stopped
> [  7474.614306] coretemp1: workqueue busy: updates stopped
> [  7474.614306] coretemp2: workqueue busy: updates stopped
> [ 23591.005414] acpitz4: workqueue busy: updates stopped
> [ 23591.005414] acpibat1: workqueue busy: updates stopped
> 
> The machine is not doing anything else at the moment, the temperatures
> are within the expected range.
> 
> Any clues?
> 
> Chavdar

Unfortunately, "me too". But I did not manage to see the logs or
track down the change which caused this behaviour (sorry).

I had panics for a while (in January?) when using nvmm, and once
that was fixed the "stalling" behaviour started.

-- 
Kind regards,

Yorick Hardy

Re: audio panic

2019-11-06 Thread Yorick Hardy

Dear Tetsuya,

On 2019-11-04, Yorick Hardy wrote:
> Dear Tetsuya,
> 
> On 2019-11-02, Tetsuya Isaki wrote:
> > At Sat, 26 Oct 2019 18:27:36 +0200,
> > Yorick Hardy wrote:
> > > [   166.145911] panic: kernel diagnostic assertion "ring->used + n <= ring->capacity" failed: file "/usr/src/local/sys/dev/audio/audiodef.h", line 406 called from audio_track_record:4518: ring->used=32256 n=32256 ring->capacity=61440
> > > [   166.145911] cpu3: Begin traceback...
> > > [   166.145911] vpanic() at netbsd:vpanic+0x178
> > > [   166.145911] kern_assert() at netbsd:kern_assert+0x48
> > > [   166.155927] audioread() at netbsd:audioread+0xb87
> > 
> > Can you reproduce this?
> > 
> > Thanks,
> > ---
> > Tetsuya Isaki 
> 
> I should be able to try get it to happen again tomorrow (I managed
> to trigger the panic on my work computer only so far). The offending
> command is:
> 
>  ffplay4 -hide_banner -showmode waves -f oss /dev/audio
> 
> (to test the microphone). I think ffmpeg was reading audio much
> slower that the driver was providing it (because of the recording
> rate mismatch in our oss which you have kindly fixed).

My attempts at reproducing this with audioio did not work. But
reverting the libossaudio fixes makes it reproducible with ffplay4
(this is again because ffplay4 reads the audio at 8000Hz instead
of 48000Hz).

I have a crash dump if that will help (custom kernel):

  Crash version 9.99.17, image version 9.99.17.
  System panicked: trap
  Backtrace from time of crash is available.
  db> crash> bt
  _KERNEL_OPT_NAGR() at 0
  ?() at b0013f65
  vpanic() at vpanic+0x181
  snprintf() at snprintf
  startlwp() at startlwp
  calltrap() at calltrap+0x11
  dofileread() at dofileread+0x8f
  sys_read() at sys_read+0x49
  syscall() at syscall+0x1d8
  --- syscall (number 3) ---
  79d6aac42b7a:

Kind regards,

-- 
Yorick Hardy

Re: audio panic

2019-11-06 Thread Yorick Hardy

Dear Tetsuya,

On 2019-11-04, Yorick Hardy wrote:
> Dear Tetsuya,
> 
> On 2019-11-02, Tetsuya Isaki wrote:
> > At Sat, 26 Oct 2019 18:27:36 +0200,
> > Yorick Hardy wrote:
> > > [   166.145911] panic: kernel diagnostic assertion "ring->used + n <= ring->capacity" failed: file "/usr/src/local/sys/dev/audio/audiodef.h", line 406 called from audio_track_record:4518: ring->used=32256 n=32256 ring->capacity=61440
> > > [   166.145911] cpu3: Begin traceback...
> > > [   166.145911] vpanic() at netbsd:vpanic+0x178
> > > [   166.145911] kern_assert() at netbsd:kern_assert+0x48
> > > [   166.155927] audioread() at netbsd:audioread+0xb87
> > 
> > Can you reproduce this?
> > 
> > Thanks,
> > ---
> > Tetsuya Isaki 
> 
> I should be able to try get it to happen again tomorrow (I managed
> to trigger the panic on my work computer only so far). The offending
> command is:
> 
>  ffplay4 -hide_banner -showmode waves -f oss /dev/audio
> 
> (to test the microphone). I think ffmpeg was reading audio much
> slower that the driver was providing it (because of the recording
> rate mismatch in our oss which you have kindly fixed).

I have not been able to reproduce this yet. I am using the fixed libossaudio,
so I will have to try going back to see if I can trigger the crash with the
unfixed version. I also tried to write a small program to reproduce the
problem, but it works without fail.

I have been reading the code around the crash a bit, and I have a question:

 https://nxr.netbsd.org/xref/src/sys/dev/audio/audiodef.h#116

 116 u_int   usrbuf_usedhigh; /* high water mark in bytes */

but usrbuf_usedhigh is used as if it is measured in frames?

 https://nxr.netbsd.org/xref/src/sys/dev/audio/audio.c#4501

 4500 count = uimin(count,
 4501 (track->usrbuf_usedhigh - usrbuf->used) / framesize);
 4502 bytes = count * framesize;
 
(and apparently also throughout the rest of audio.c).

I wonder if this should be:

 4500 count = uimin(count,
 4501 track->usrbuf_usedhigh - usrbuf->used);
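
To make the unit question concrete, here is a small sketch of both forms of the clamp with the units written into the names. It assumes that `count` is a frame count and that `usrbuf->used` is a byte count; neither declaration appears in the excerpt, so both are assumptions. `clamp_frames` and `clamp_no_divide` are hypothetical helpers, not kernel functions, and both assume used_bytes <= usedhigh_bytes.

```c
/* Smaller of two unsigned values (stand-in for the kernel's uimin). */
static unsigned int
min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* The quoted form: convert the free space from bytes to frames first. */
unsigned int
clamp_frames(unsigned int count_frames, unsigned int usedhigh_bytes,
    unsigned int used_bytes, unsigned int framesize)
{
	unsigned int room_frames = (usedhigh_bytes - used_bytes) / framesize;

	return min_u(count_frames, room_frames);
}

/* The alternative form: compare count against the difference directly. */
unsigned int
clamp_no_divide(unsigned int count, unsigned int usedhigh,
    unsigned int used)
{
	return min_u(count, usedhigh - used);
}
```

With purely illustrative numbers (framesize 4, usedhigh 61440, used 32256), a request for 8064 is clamped to 7296 by the first form and left at 8064 by the second. Which result is right depends on the unit of `count`, which is exactly the open question.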

I doubt that this is the cause of the panic though. If I may guess:

 https://nxr.netbsd.org/xref/src/sys/dev/audio/audio.c#4521
 https://nxr.netbsd.org/xref/src/sys/dev/audio/audio.c#4525

 4521 bytes2 = bytes - bytes1;
 4525 auring_push(usrbuf, bytes2);
 
does not check whether bytes2 is small enough? (The panic happened when
the software was consuming audio at a much lower rate than it asked for.)
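
A minimal sketch of such a check follows. `struct ring` and `ring_push_checked` are hypothetical stand-ins for illustration, not the kernel's types or functions; the sketch only mirrors the condition from the assertion message and assumes used <= capacity on entry.

```c
#include <stdbool.h>

/* Hypothetical, reduced ring descriptor: only the two fields needed here. */
struct ring {
	unsigned int used;	/* bytes currently stored */
	unsigned int capacity;	/* total bytes the ring can hold */
};

/*
 * Account for n more bytes only if they fit. Returns true on success;
 * on failure the ring is left untouched so the caller can clamp n.
 */
bool
ring_push_checked(struct ring *r, unsigned int n)
{
	if (n > r->capacity - r->used)
		return false;
	r->used += n;
	return true;
}
```

With the values from the panic message (used=32256, n=32256, capacity=61440) the push is refused, since 32256 + 32256 = 64512 exceeds 61440.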

Thanks for looking at this!

-- 
Yorick Hardy

Re: audio panic

2019-11-04 Thread Yorick Hardy

Dear Tetsuya,

On 2019-11-02, Tetsuya Isaki wrote:
> At Sat, 26 Oct 2019 18:27:36 +0200,
> Yorick Hardy wrote:
> > [   166.145911] panic: kernel diagnostic assertion "ring->used + n <= ring->capacity" failed: file "/usr/src/local/sys/dev/audio/audiodef.h", line 406 called from audio_track_record:4518: ring->used=32256 n=32256 ring->capacity=61440
> > [   166.145911] cpu3: Begin traceback...
> > [   166.145911] vpanic() at netbsd:vpanic+0x178
> > [   166.145911] kern_assert() at netbsd:kern_assert+0x48
> > [   166.155927] audioread() at netbsd:audioread+0xb87
> 
> Can you reproduce this?
> 
> Thanks,
> ---
> Tetsuya Isaki 

I should be able to try to get it to happen again tomorrow (so far I have
managed to trigger the panic only on my work computer). The offending
command is:

 ffplay4 -hide_banner -showmode waves -f oss /dev/audio

(to test the microphone). I think ffmpeg was reading audio much
slower than the driver was providing it (because of the recording
rate mismatch in our oss which you have kindly fixed).

-- 
Kind regards,

Yorick Hardy

Re: X11 - WindowMaker characters issues after update

2019-11-04 Thread Yorick Hardy

Dear Riccardo,

On 2019-11-03, Riccardo Mottola wrote:
> Hi,
> 
> I upgraded NetBSD current and then upgraded all pkgsrc packages with
> pkg_rolling-replace.
> After two attempts of prr, everything upgraded except rust (which continues
> to hang/fail).
> 
> X11 works, but I use WindowMaker and it is unusable now (and WPrefs
> neither). All displayed characters are replaced with rectangles with small
> numbers (like when special characters cannot be displayed): everything,
> menus, window titles, etc. Unreadable.
> 
> Interestingly, other X11 apps work: both xterm, xedit work. Also SeaMonkey
> and gVim.
> 
> Below I paste ld of wmaker:
> 
> /usr/pkg/bin/wmaker:
>     -lWINGs.3 => /usr/pkg/lib/libWINGs.so.3
>     -lWUtil.5 => /usr/pkg/lib/libWUtil.so.5
>     -lkvm.6 => /usr/lib/libkvm.so.6
>     -lc.12 => /usr/lib/libc.so.12
>     -lintl.1 => /usr/lib/libintl.so.1
>     -lwraster.6 => /usr/pkg/lib/libwraster.so.6
>     -lXpm.5 => /usr/X11R7/lib/libXpm.so.5
>     -lXext.7 => /usr/X11R7/lib/libXext.so.7
>     -lX11.7 => /usr/X11R7/lib/libX11.so.7
>     -lxcb.2 => /usr/X11R7/lib/libxcb.so.2
>     -lXau.7 => /usr/X11R7/lib/libXau.so.7
>     -lXdmcp.7 => /usr/X11R7/lib/libXdmcp.so.7
>     -lpng16.16 => /usr/pkg/lib/libpng16.so.16
>     -lz.1 => /usr/lib/libz.so.1
>     -lm.0 => /usr/lib/libm.so.0
>     -lgif.7 => /usr/pkg/lib/libgif.so.7
>     -ltiff.5 => /usr/pkg/lib/libtiff.so.5
>     -llzma.2 => /usr/lib/liblzma.so.2
>     -lpthread.1 => /usr/lib/libpthread.so.1
>     -ljbig.2 => /usr/pkg/lib/libjbig.so.2
>     -ljpeg.9 => /usr/pkg/lib/libjpeg.so.9
>     -lwebp.7 => /usr/pkg/lib/libwebp.so.7
>     -lXmu.7 => /usr/X11R7/lib/libXmu.so.7
>     -lXt.7 => /usr/X11R7/lib/libXt.so.7
>     -lSM.7 => /usr/X11R7/lib/libSM.so.7
>     -lICE.7 => /usr/X11R7/lib/libICE.so.7
>     -lpangoxft-1.0.0 => /usr/pkg/lib/libpangoxft-1.0.so.0
>     -lpango-1.0.0 => /usr/pkg/lib/libpango-1.0.so.0
>     -lglib-2.0.0 => /usr/pkg/lib/libglib-2.0.so.0
>     -lpcre.1 => /usr/pkg/lib/libpcre.so.1
>     -lgobject-2.0.0 => /usr/pkg/lib/libgobject-2.0.so.0
>     -lffi.6 => /usr/pkg/lib/libffi.so.6
>     -lfribidi.0 => /usr/pkg/lib/libfribidi.so.0
>     -lharfbuzz.0 => /usr/pkg/lib/libharfbuzz.so.0
>     -lfreetype.19 => /usr/X11R7/lib/libfreetype.so.19
>     -lbz2.1 => /usr/lib/libbz2.so.1
>     -lgraphite2.3 => /usr/pkg/lib/libgraphite2.so.3
>     -lstdc++.9 => /usr/lib/libstdc++.so.9
>     -lgcc_s.1 => /usr/lib/libgcc_s.so.1
>     -lpangoft2-1.0.0 => /usr/pkg/lib/libpangoft2-1.0.so.0
>     -lfontconfig.2 => /usr/X11R7/lib/libfontconfig.so.2
>     -lexpat.2 => /usr/lib/libexpat.so.2
>     -lXrender.2 => /usr/X11R7/lib/libXrender.so.2
>     -lXft.3 => /usr/X11R7/lib/libXft.so.3
>     -lXrandr.3 => /usr/X11R7/lib/libXrandr.so.3
>     -lXinerama.2 => /usr/X11R7/lib/libXinerama.so.2

This might be more pango fallout:

 https://blogs.gnome.org/mclasen/2019/05/25/pango-future-directions/

 "Using Harfbuzz for font loading means that we will lose support
 for bitmap and type1 fonts. We think this is an acceptable trade-off,
 but others might disagree. Note that Harfbuzz does support loading
 bitmap-only OpenType fonts."

-- 
Kind regards,

Yorick Hardy

audio panic

2019-10-26 Thread Yorick Hardy

Dear current-users,

When recording with a recently updated kernel, I experienced some
kernel panics.  I think this has been the case for some time now
- but I only got around to reporting the panics now (apologies).
I captured two panics in my dmesg; I guess the second is the most
relevant.

I have also managed to record audio using ffmpeg4 without a panic,
but the audio plays back too slow and is a bit garbled (I was unable
to diagnose the quality problems - but the sampling speed was much
higher than specified).

Any tips would be appreciated.

uname:
NetBSD HOME 9.99.17 NetBSD 9.99.17 (YORICK.amd64) #1: Sat Oct 26 00:06:47 SAST 2019  root@HOME:/root/build.amd64.local/obj/sys/arch/amd64/compile/YORICK.amd64 amd64

dmesg:
...
[ 1.052454] hdaudio0 at pci0 dev 31 function 3: HD Audio Controller
[ 1.052454] hdaudio0: interrupting at msi3 vec 0
[ 1.052454] hdafg0 at hdaudio0: vendor 14f1 product 50f4
[ 1.052454] hdafg0: DAC00 2ch: Speaker [Built-In]
[ 1.052454] hdafg0: ADC01 2ch: Mic In [Built-In]
[ 1.052454] hdafg0: DAC03 2ch: HP Out [Jack]
[ 1.052454] hdafg0: 2ch/2ch 48000Hz 96000Hz PCM16 PCM24 AC3
[ 1.052454] audio0 at hdafg0: playback, capture, full duplex, independent
[ 1.052454] audio0: slinear_le:16 2ch 48000Hz, blk 40ms for playback
[ 1.052454] audio0: slinear_le:16 2ch 48000Hz, blk 40ms for recording
[ 1.052454] spkr0 at audio0: PC Speaker (synthesized)
[ 1.052454] wsbell at spkr0 not configured
[ 1.052454] hdafg1 at hdaudio0: vendor 8086 product 2809
[ 1.052454] hdafg1: DP00 8ch: Digital Out [Jack]
[ 1.052454] hdafg1: 8ch/0ch 48000Hz PCM16*
...
[   253.893572] uvm_fault(0x810c3160, 0xc8002035a000, 1) -> e
[   253.893572] fatal page fault in supervisor mode
[   253.893572] trap type 6 code 0 rip 0x80b090df cs 0x8 rflags 0x10246 cr2 0xc8002035a000 ilevel 0 rsp 0xc8013f042e78
[   253.893572] curlwp 0x80a4e46150e0 pid 2801.4 lowest kstack 0xc8013f0402c0
[   253.893572] panic: trap
[   253.893572] cpu1: Begin traceback...
[   253.893572] vpanic() at netbsd:vpanic+0x178
[   253.893572] snprintf() at netbsd:snprintf
[   253.893572] startlwp() at netbsd:startlwp
[   253.893572] alltraps() at netbsd:alltraps+0xbb
[   253.893572] dofileread() at netbsd:dofileread+0x8f
[   253.903584] sys_read() at netbsd:sys_read+0x49
[   253.903584] syscall() at netbsd:syscall+0x1d8
[   253.903584] --- syscall (number 3) ---
[   253.903584] 7e3IV, PNP0C14-2): ACPI WMI Interface
...
[   166.145911] panic: kernel diagnostic assertion "ring->used + n <= ring->capacity" failed: file "/usr/src/local/sys/dev/audio/audiodef.h", line 406 called from audio_track_record:4518: ring->used=32256 n=32256 ring->capacity=61440
[   166.145911] cpu3: Begin traceback...
[   166.145911] vpanic() at netbsd:vpanic+0x178
[   166.145911] kern_assert() at netbsd:kern_assert+0x48
[   166.155927] audioread() at netbsd:audioread+0xb87
[   166.155927] dofileread() at netbsd:dofileread+0x8f
[   166.155927] sys_read() at netbsd:sys_read+0x49
[   166.155927] syscall() at netbsd:syscall+0x211
[   166.155927] --- syscall (number 3) ---
[   166.155927] 7047c3c42b7a:
[   166.155927] cpu3: End traceback...

-- 
Kind regards,

Yorick Hardy

Re: Mesa update

2019-04-22 Thread Yorick Hardy

On 2019-04-18, co...@sdf.org wrote:
> LD_PRELOAD=/usr/lib/libpthread.so fixes it.
> It's the libc stubs. I don't know what to link against libpthread but
> pkgsrc is cheating by having glmark2 linked with libpthread.

Thank you for updating Mesa; it is much appreciated. SDL2 applications
which load libGL.so seem to fail due to missing symbols, so I had to add
some dependencies as below. Is this correct?

-- 
Kind regards,

Yorick Hardy

Index: external/mit/xorg/lib/Makefile
===
RCS file: /cvsroot/src/external/mit/xorg/lib/Makefile,v
retrieving revision 1.49
diff -u -u -r1.49 Makefile
--- external/mit/xorg/lib/Makefile  16 Apr 2019 21:20:51 -  1.49
+++ external/mit/xorg/lib/Makefile  22 Apr 2019 06:43:24 -
@@ -20,10 +20,12 @@
 .endif
 SUBDIR+=libxcb \
.WAIT
+SUBDIR+=libX11 \
+   .WAIT
 .if !defined(MLIBDIR)
 SUBDIR+=${EXTRA_DRI_DIRS} dri${OLD_PREFIX} gallium${OLD_PREFIX}
 .endif
-SUBDIR+=fontconfig libSM libX11 \
+SUBDIR+=fontconfig libSM \
.WAIT \
libXcomposite libXdamage libXext libXfixes libXt \
libxkbfile libepoxy \
Index: external/mit/xorg/lib/gallium/Makefile
===
RCS file: /cvsroot/src/external/mit/xorg/lib/gallium/Makefile,v
retrieving revision 1.25
diff -u -u -r1.25 Makefile
--- external/mit/xorg/lib/gallium/Makefile  16 Apr 2019 17:29:09 -  1.25
+++ external/mit/xorg/lib/gallium/Makefile  22 Apr 2019 06:43:24 -
@@ -957,6 +957,9 @@
 LIBDPLIBS+=terminfo${.CURDIR}/../../../../../lib/libterminfo
 LIBDPLIBS+=z   ${.CURDIR}/../../../../../lib/libz
 LIBDPLIBS+=execinfo${.CURDIR}/../../../../../lib/libexecinfo
+LIBDPLIBS+=xcb ${.CURDIR}/../libxcb/libxcb
+LIBDPLIBS+=xcb-dri2${.CURDIR}/../libxcb/dri2
+LIBDPLIBS+=X11-xcb ${.CURDIR}/../libX11/libX11-xcb
 
 # gallium drivers requiring LLVM
 .if ${BUILD_LLVMPIPE} == 1 || ${BUILD_RADEON} == 1

Re: CVS commit: src/sys/external/bsd/drm2

2017-03-01 Thread Yorick Hardy

On 2017-03-01, Martin Husemann wrote:
> On Wed, Mar 01, 2017 at 06:55:24PM +0900, Kimihiro Nonaka wrote:
> > Updated the patch.
> 
> Still works fine for me!
> 
> Martin

Also works fine for me on i386+intel and amd64+radeon.

Thanks!

-- 
Kind regards,

Yorick Hardy

Re: CVS commit: src/sys/external/bsd/drm2

2017-02-28 Thread Yorick Hardy

On 2017-02-28, Kimihiro Nonaka wrote:
> Hi,
> 
> 2017-02-28 4:10 GMT+09:00 Martin Husemann <mar...@duskware.de>:
> 
> > On Mon, Feb 27, 2017 at 07:39:49PM +0100, Martin Husemann wrote:
> >> On Mon, Feb 27, 2017 at 08:29:10PM +0200, Yorick Hardy wrote:
> >> > Is anyone else experiencing GPU hangs since revision 1.14
> >> > of src/sys/external/bsd/drm2/pci/drm_pci.c ?
> >>
> >> Thanks for the hint, I'm testing if it is the cause for
> >>
> >> http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=51997
> >>
> >> and will report back ASAP...
> >
> > Yes, that seems to be the case.
> 
> I've reverted.

Thank you!

It is not obvious to me why the MSI changes were problematic.
Do you have any ideas what went wrong?

-- 
Kind regards,

Yorick Hardy

Re: CVS commit: src/sys/external/bsd/drm2

2017-02-27 Thread Yorick Hardy

ndor 8086 product 27c5 (rev. 0x02)
ahcisata0: interrupting at ioapic0 pin 17
ahcisata0: AHCI revision 1.10, 4 ports, 32 slots, CAP 0xdf10ff03<PSC,SSC,PMD,ISS=0x1=Gen1,SCLO,SAL,SALP,SSS,SMPS,SNCQ,S64A>
atabus1 at ahcisata0 channel 0
atabus2 at ahcisata0 channel 2
ichsmb0 at pci0 dev 31 function 3: vendor 8086 product 27da (rev. 0x02)
ichsmb0: interrupting at ioapic0 pin 17
iic0 at ichsmb0: I2C bus
isa0 at ichlpcib0
acpicpu0 at cpu0: ACPI CPU
acpicpu0: C1: FFH, lat   1 us, pow  1000 mW
acpicpu0: C2: I/O, lat   1 us, pow   500 mW
acpicpu0: C3: I/O, lat  57 us, pow   100 mW
acpicpu0: P0: FFH, lat  10 us, pow  2000 mW, 1600 MHz
acpicpu0: P1: FFH, lat  10 us, pow  1533 mW, 1333 MHz
acpicpu0: P2: FFH, lat  10 us, pow  1066 mW, 1066 MHz
acpicpu0: P3: FFH, lat  10 us, pow   600 mW,  800 MHz
acpicpu0: T0: I/O, lat   1 us, pow 0 mW, 100 %
acpicpu0: T1: I/O, lat   1 us, pow 0 mW,  88 %
acpicpu0: T2: I/O, lat   1 us, pow 0 mW,  76 %
acpicpu0: T3: I/O, lat   1 us, pow 0 mW,  64 %
acpicpu0: T4: I/O, lat   1 us, pow 0 mW,  52 %
acpicpu0: T5: I/O, lat   1 us, pow 0 mW,  40 %
acpicpu0: T6: I/O, lat   1 us, pow 0 mW,  28 %
acpicpu0: T7: I/O, lat   1 us, pow 0 mW,  16 %
coretemp0 at cpu0: thermal sensor, 1 C resolution, Tjmax=100
acpicpu1 at cpu1: ACPI CPU
DRM error in i915_irq_handler: pipe A underrun
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
acpiacad0: AC adapter offline.
IPsec: Initialized Security Association Processing.
uhub0 at usb0: vendor 8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1 at usb1: vendor 8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub2 at usb2: vendor 8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhub3 at usb3: vendor 8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
uhub4 at usb4: vendor 8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
ahcisata0 port 0: device present, speed: 1.5Gb/s
wd0 at atabus1 drive 0
wd0: 
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 232 GB, 484521 cyl, 16 head, 63 sec, 512 bytes/sect x 488397168 sectors
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd0(ahcisata0:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100) (using DMA)
uvideo0 at uhub4 port 5 configuration 1 interface 0: BISON Corporation LG, rev 2.00/18.02, addr 2
video0 at uvideo0: BISON Corporation LG, rev 2.00/18.02, addr 2
WARNING: 2 errors while detecting hardware; check system log.
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
kern.module.path=/stand/i386/7.99.62/modules
ral0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
ral0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
ubt0 at uhub1 port 1
ubt0: Broadcom Corp Broadcom Bluetooth 2.1 Device, rev 2.00/0.72, addr 2
wsdisplay0: screen 1 added (default, vt100 emulation)
wsdisplay0: screen 2 added (default, vt100 emulation)
wsdisplay0: screen 3 added (default, vt100 emulation)
wsdisplay0: screen 4 added (default, vt100 emulation)

-- 
Kind regards,

Yorick Hardy

Re: XOrg oriented build failure

2017-01-05 Thread Yorick Hardy

On 2017-01-04, bch wrote:
> On Jan 4, 2017 21:37, "Martin Husemann" <mar...@duskware.de> wrote:
> 
> On Wed, Jan 04, 2017 at 09:29:05PM -0800, bch wrote:
> > transform.o: In function `map_to_output':
> > transform.c:(.text+0x2a3): undefined reference to `xi2_find_device_info'
> > collect2: error: ld returned 1 exit status
> 
> Clean the xinput obj dir and retry (this needs another UPDATING entry).
> 
> 
> I tried that (given our exchange earlier today), but I'll try again in case
> some cruft was missed.
> 
> Cheers,

That symbol is defined in xsrc/external/mit/xinput/dist/src/xinput.c, but is
protected by a "#ifdef HAVE_XI2".

Did you clear out obj/external/mit/xorg/bin/xinput?
(It seems yes from your answer above.)

If it is broken, then I am responsible and will try to fix it.

-- 
Kind regards,

Yorick Hardy

Illegal instruction in libcrypto.so

2017-01-04 Thread Yorick Hardy

Dear all,

I frequently encounter an illegal instruction when using SSL with python2.7
and wip/rawdog.

  $ uname -a
  NetBSD HOME 7.99.53 NetBSD 7.99.53 (YORICK.amd64) #0: Sun Jan  1 16:42:47 SAST 2017  root@HOME:/root/build.amd64.local/obj/sys/arch/amd64/compile/YORICK.amd64 amd64

It seems OpenSSL is using instructions which are not available on my CPU
(no known problems on 7-STABLE).

The last few entries of the backtrace are:

  Core was generated by `python2.7'.
  Program terminated with signal SIGILL, Illegal instruction.
  #0  0x71df7e9735ca in bn_GF2m_mul_2x2 () from /usr/lib/libcrypto.so.12
  [Current thread is 1 (LWP 1)]
  (gdb) bt
  #0  0x71df7e9735ca in bn_GF2m_mul_2x2 () from /usr/lib/libcrypto.so.12
  #1  0x71df7e96ebb7 in BN_GF2m_mod_mul_arr () from /usr/lib/libcrypto.so.12
  #2  0x71df7e968cc9 in ec_GF2m_simple_is_on_curve () from /usr/lib/libcrypto.so.12
  #3  0x71df7e927733 in ec_GF2m_simple_oct2point () from /usr/lib/libcrypto.so.12
  #4  0x71df7ee45813 in ssl3_get_key_exchange () from /usr/lib/libssl.so.12
  #5  0x71df7ee46789 in ssl3_connect () from /usr/lib/libssl.so.12
  #6  0x71df7cc0bdbe in PySSL_SSLdo_handshake () from /usr/pkg/lib/python2.7/lib-dynload/_ssl.so
  #7  0x71df842d39bd in PyEval_EvalFrameEx () from /usr/pkg/lib/libpython2.7.so.1.0

  Dump of assembler code for function bn_GF2m_mul_2x2:
     0x71df7e9735a0 <+0>:     lea    0x2b30f5(%rip),%rax        # 0x71df7ec2669c
     0x71df7e9735a7 <+7>:     bt     $0x21,%rax
     0x71df7e9735ac <+12>:    jae    0x71df7e973610 <bn_GF2m_mul_2x2+112>
     0x71df7e9735ae <+14>:    movq   %rsi,%xmm0
     0x71df7e9735b3 <+19>:    movq   %rcx,%xmm1
     0x71df7e9735b8 <+24>:    movq   %rdx,%xmm2
     0x71df7e9735bd <+29>:    movq   %r8,%xmm3
     0x71df7e9735c2 <+34>:    movdqa %xmm0,%xmm4
     0x71df7e9735c6 <+38>:    movdqa %xmm1,%xmm5
  => 0x71df7e9735ca <+42>:    pclmullqlqdq %xmm1,%xmm0
     0x71df7e9735d0 <+48>:    pxor   %xmm2,%xmm4
     0x71df7e9735d4 <+52>:    pxor   %xmm3,%xmm5
     0x71df7e9735d8 <+56>:    pclmullqlqdq %xmm3,%xmm2
     0x71df7e9735de <+62>:    pclmullqlqdq %xmm5,%xmm4
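
A hedged reading of the first three instructions in the dump: lea only
computes an address (next instruction address plus displacement) and does not
load from it, so "bt $0x21,%rax" tests bit 33 of the address 0x71df7ec2669c
rather than bit 33 of whatever is stored there. That the stored value is meant
to be a CPU capability word is my assumption; the dump does not show it. The
sketch below just redoes the arithmetic; carry_after_bt is a hypothetical
helper name.

```python
# Sketch of what "lea ...,%rax ; bt $0x21,%rax ; jae ..." evaluates.
# All numeric values are copied from the gdb disassembly.

BIT = 0x21  # 33, the bit index used by "bt $0x21,%rax"


def carry_after_bt(rax, bit=BIT):
    """Return the carry flag that `bt $bit,%rax` would set (hypothetical helper)."""
    return (rax >> bit) & 1


# lea adds the displacement to the address of the next instruction.
next_insn = 0x71df7e9735a7
address = next_insn + 0x2b30f5

# jae branches (to +112) when carry == 0; with carry == 1 execution
# falls through towards the pclmullqlqdq instruction at +42.
falls_through = carry_after_bt(address) == 1

print(hex(address), falls_through)
```

If this reading is right, the branch depends on where the library happens to
be mapped, which would fit the crash being frequent rather than constant.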

The CPU features clearly omit PCLMULQDQ:

  $ cpuctl identify 0
  cpu0: highest basic info 0005
  cpu0: highest extended info 801b
  cpu0: "AMD Athlon(tm) II X3 450 Processor"
  cpu0: AMD Family 10h (686-class), 3200.27 MHz
  cpu0: family 0x10 model 0x5 stepping 0x3 (id 0x100f53)
  cpu0: features 0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE>
  cpu0: features 0x178bfbff<MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  cpu0: features1 0x802009<SSE3,MONITOR,CX16,POPCNT>
  cpu0: features2 0xefd3fbff
  cpu0: features2 0xefd3fbff<3DNOW2,3DNOW>
  cpu0: features3 0x37ff<LAHF,CMPLEGACY,SVM,EAPIC,ALTMOVCR0,LZCNT,SSE4A>
  cpu0: features3 0x37ff<MISALIGNSSE,3DNOWPREFETCH,OSVW,IBS,SKINIT,WDT>
  cpu0: I-cache 64KB 64B/line 2-way, D-cache 64KB 64B/line 2-way
  cpu0: L2 cache 512KB 64B/line 16-way
  cpu0: ITLB 32 4KB entries fully associative, 16 2MB entries fully associative
  cpu0: DTLB 48 4KB entries fully associative, 48 2MB entries fully associative
  cpu0: L2 ITLB 512 4KB entries 4-way
  cpu0: L2 DTLB 512 4KB entries 4-way, 128 2MB entries 2-way
  cpu0: L1 1GB page DTLB 48 1GB entries fully associative
  cpu0: L2 1GB page DTLB 16 1GB entries 8-way
  cpu0: Initial APIC ID 0
  cpu0: AMD Power Management features: 0x1f9<TS,TTP,HTC,STC,100,HWP,TSC>
  cpu0: SVM Rev. 1
  cpu0: SVM NASID 64
  cpu0: SVM features 0xf<NP,LbrVirt,SVML,NRIPS>
  cpu0: UCode version: 0x1c8
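
As a cross-check of the features1 line, here is a small sketch that decodes
the value 0x802009 by bit position. It assumes features1 is the ECX word of
CPUID leaf 1 (the names cpuctl prints suggest this), where PCLMULQDQ is bit 1.
The table only lists the bits relevant here, and decode is a hypothetical
helper name.

```python
# Selected CPUID leaf 1 ECX bit positions (assumption: features1 is this word).
ECX_BITS = {0: "SSE3", 1: "PCLMULQDQ", 3: "MONITOR", 13: "CX16", 23: "POPCNT"}


def decode(ecx):
    """Return the names of the listed feature bits that are set in ecx."""
    return [name for bit, name in sorted(ECX_BITS.items()) if (ecx >> bit) & 1]


features1 = 0x802009  # value from the cpuctl output
names = decode(features1)
print(names)
print("PCLMULQDQ" in names)
```

Bit 1 (mask 0x2) is clear in 0x802009, which agrees with PCLMULQDQ being
absent from the list printed by cpuctl.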


A wild guess is that this change is involved

 
http://cvsweb.netbsd.org/bsdweb.cgi/src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gf2m.pl.diff?r1=1.3&r2=1.4

but I don't understand the change.

Any ideas what could be wrong?

-- 
Kind regards,

Yorick Hardy

Re: Fix for kern/51772 breaks linking multi-config kernels?

2017-01-04 Thread Yorick Hardy

Dear Christos,

On 2017-01-04, Christos Zoulas wrote:
> In article <20170104195823.GD26839@HOME>,
> Yorick Hardy  <yorickha...@gmail.com> wrote:
> >Dear Martin,
> >
> >On 2017-01-04, Martin Husemann wrote:
> >> Can't you just use swap${.TARGET}.c instead of the wildcard?
> >> 
> >> Martin
> >
> >I don't think so, because then we pick up more than one swap*.o
> >when linking (and redefinition of symbols).
> >
> >Or did I misunderstand? (I assumed you meant swap${.TARGET}.o).
> >
> >I think I have a working patch, but I think we can do better (i.e. fewer
> >assumptions about filenames):
> >
> >Index: sys/conf/Makefile.kern.inc
> >===
> >RCS file: /cvsroot/src/sys/conf/Makefile.kern.inc,v
> >retrieving revision 1.251
> >diff -u -r1.251 Makefile.kern.inc

[snip]

> I thought we wanted to match the M and N modifiers so we select and deselect
> the same files?

That was my thought too! But when multiple kernels are configured we get
multiple swap*.o files with duplicate symbols.

How about the following? This patch removes all the wildcards, removes the
swap*.o files for all configured kernels and includes the swap*.o file for
the current target.
Did I miss anything?
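
To illustrate the idea, here is a rough Python sketch of what the .for loop
and the two modifiers in the patch are meant to compute. It is only an
approximation: fnmatch stands in for make's own pattern matching, "[@]" is
treated as a no-op first modifier (my understanding is that it simply selects
all words), and the object list and kernel names are made-up examples.

```python
from fnmatch import fnmatchcase


def build_remove_swap(kernels):
    """Mimic the .for loop: start with "[@]" and append one :N per kernel."""
    chain = "[@]"
    for k in kernels:
        chain += ":Nswap%s.o" % k
    return chain


def apply_chain(words, chain):
    """Apply each :Npattern in turn; anything else (the "[@]") is skipped."""
    for mod in chain.split(":"):
        if mod.startswith("N"):
            words = [w for w in words if not fnmatchcase(w, mod[1:])]
    return words


kernels = ["netbsd", "netbsd-sd0a", "netbsd-reth0"]   # example ${KERNELS}
objs = ["locore.o", "uvm_swap.o", "swapnetbsd.o",
        "swapnetbsd-sd0a.o", "swapnetbsd-reth0.o"]    # example object list
target = "netbsd-sd0a"                                # example ${.TARGET}

chain = build_remove_swap(kernels)
# ${SYSTEM_OBJ:${REMOVE_SWAP}} followed by ${OBJS:Mswap${.TARGET}.o}
link = apply_chain(objs, chain) + [o for o in objs
                                   if fnmatchcase(o, "swap%s.o" % target)]
print(chain)
print(link)
```

In this sketch every swap<kernel>.o is dropped by exact name, other objects
with "swap" in their name are untouched, and only the current target's swap
object is added back.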

-- 
Kind regards,

Yorick Hardy

Index: sys/conf/Makefile.kern.inc
===
RCS file: /cvsroot/src/sys/conf/Makefile.kern.inc,v
retrieving revision 1.252
diff -u -r1.252 Makefile.kern.inc
--- sys/conf/Makefile.kern.inc  4 Jan 2017 19:55:06 -   1.252
+++ sys/conf/Makefile.kern.inc  4 Jan 2017 20:26:50 -
@@ -203,6 +203,10 @@
 
 SYSTEM_LIB=${MD_LIBS} ${SYSLIBCOMPAT} ${LIBKERN}
 SYSTEM_OBJ?=   ${_MD_OBJS} ${OBJS} ${SYSTEM_LIB}
+REMOVE_SWAP=   [@]
+.for k in ${KERNELS}
+REMOVE_SWAP:=  ${REMOVE_SWAP}:Nswap${k}.o
+.endfor
 SYSTEM_DEP+=   Makefile ${SYSTEM_OBJ:O}
 .if defined(CTFMERGE)
 SYSTEM_CTFMERGE= ${CTFMERGE} ${CTFMFLAGS} -o ${.TARGET} ${SYSTEM_OBJ} 
${EXTRA_OBJ} vers.o
@@ -213,11 +217,11 @@
 SYSTEM_LD?=${_MKSHMSG} "   link  ${.CURDIR:T}/${.TARGET}"; \
${_MKSHECHO}\
${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
-   '$${SYSTEM_OBJ:N*swap${.TARGET}.o}' '$${EXTRA_OBJ}' vers.o \
-   ${OBJS:M*swap${.TARGET}.o}; \
+   '$${SYSTEM_OBJ:${REMOVE_SWAP}}' '$${EXTRA_OBJ}' vers.o \
+   ${OBJS:Mswap${.TARGET}.o}; \
${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
-   ${SYSTEM_OBJ:N*swap${.TARGET}.o} ${EXTRA_OBJ} vers.o \
-   ${OBJS:M*swap${.TARGET}.o}
+   ${SYSTEM_OBJ:${REMOVE_SWAP}} ${EXTRA_OBJ} vers.o \
+   ${OBJS:Mswap${.TARGET}.o}
 
 TEXTADDR?= ${LOADADDRESS}  # backwards compatibility
 LINKTEXT?= ${TEXTADDR:C/.+/-Ttext &/}

Re: Fix for kern/51772 breaks linking multi-config kernels?

2017-01-04 Thread Yorick Hardy

Dear Martin,

On 2017-01-04, Martin Husemann wrote:
> Can't you just use swap${.TARGET}.c instead of the wildcard?
> 
> Martin

I don't think so, because then we pick up more than one swap*.o
when linking (and redefinition of symbols).

Or did I misunderstand? (I assumed you meant swap${.TARGET}.o).

I think I have a working patch, but I think we can do better (i.e. fewer
assumptions about filenames):

Index: sys/conf/Makefile.kern.inc
===
RCS file: /cvsroot/src/sys/conf/Makefile.kern.inc,v
retrieving revision 1.251
diff -u -r1.251 Makefile.kern.inc
--- sys/conf/Makefile.kern.inc  4 Jan 2017 15:43:04 -   1.251
+++ sys/conf/Makefile.kern.inc  4 Jan 2017 18:12:27 -
@@ -213,10 +213,10 @@
 SYSTEM_LD?=${_MKSHMSG} "   link  ${.CURDIR:T}/${.TARGET}"; \
${_MKSHECHO}\
${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
-   '$${SYSTEM_OBJ:N*swap*${.TARGET}*}' '$${EXTRA_OBJ}' vers.o \
+   '$${SYSTEM_OBJ:Nswap*}' '$${EXTRA_OBJ}' vers.o \
${OBJS:M*swap${.TARGET}.o}; \
${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
-   ${SYSTEM_OBJ:N*swap*${.TARGET}*} ${EXTRA_OBJ} vers.o \
+   ${SYSTEM_OBJ:Nswap*} ${EXTRA_OBJ} vers.o \
${OBJS:M*swap${.TARGET}.o}
 
 TEXTADDR?= ${LOADADDRESS}  # backwards compatibility

-- 
Kind regards,

Yorick Hardy

Re: Fix for kern/51772 breaks linking multi-config kernels?

2017-01-04 Thread Yorick Hardy

On 2017-01-04, Yorick Hardy wrote:
> On 2017-01-04, Yorick Hardy wrote:
> > Dear Martin,
> > 
> > On 2017-01-04, Martin Husemann wrote:
> > > On Wed, Jan 04, 2017 at 07:28:19PM +0200, Yorick Hardy wrote:
> > > > Apologies, my "fix" broke your build. I wonder why it worked before,
> > > > probably because you have "netbsd" as part of your kernel name?
> > > > 
> > > > Maybe the change should be reverted until the correct solution is found.
> > > 
> > > Just a side note: we have several evb* configs that use multiple config
> > > statements, for example sys/arch/evbmips/conf/ZYXELKX:
> > > 
> > > config  netbsd root on ? type ?
> > > config  netbsd-sd0a root on sd0a type ffs dumps none
> > > config  netbsd-reth0 root on reth0 type nfs dumps none
> > > 
> > > 
> > > Martin
> > 
> > I am to blame for testing only a very simple configuration!
> > 
> > But it still seems wrong...
> > 
> > Is it correct that ${SYSTEM_OBJ:N*swap*netbsd*} is only for
> > configurations named "netbsd*" - are other names allowed?
> > 
> > Maybe we should just use ${SYSTEM_OBJ:N*swap*}, but I still
> > need to check whether any other object files match this pattern.
> 
> The suggested pattern will not work.
> 
> Perhaps I am the only person with a kernel not named "netbsd*"!
> I suggest reverting the change until we find a solution, with my
> apologies.

I am testing the patch below; it worked for my kernel, which is not named
"netbsd*". I am now testing ZYXELKX. My machine is not the fastest, so it
will take a while.

Are there any other possible object files beginning with swap* ?

-- 
Kind regards,

Yorick Hardy

Index: sys/conf/Makefile.kern.inc
===
RCS file: /cvsroot/src/sys/conf/Makefile.kern.inc,v
retrieving revision 1.251
diff -u -r1.251 Makefile.kern.inc
--- sys/conf/Makefile.kern.inc  4 Jan 2017 15:43:04 -   1.251
+++ sys/conf/Makefile.kern.inc  4 Jan 2017 18:12:27 -
@@ -213,10 +213,10 @@
 SYSTEM_LD?=${_MKSHMSG} "   link  ${.CURDIR:T}/${.TARGET}"; \
${_MKSHECHO}\
${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
-   '$${SYSTEM_OBJ:N*swap*${.TARGET}*}' '$${EXTRA_OBJ}' vers.o \
+   '$${SYSTEM_OBJ:Nswap*}' '$${EXTRA_OBJ}' vers.o \
${OBJS:M*swap${.TARGET}.o}; \
${LD} -Map ${.TARGET}.map --cref ${LINKFLAGS} -o ${.TARGET} \
-   ${SYSTEM_OBJ:N*swap*${.TARGET}*} ${EXTRA_OBJ} vers.o \
+   ${SYSTEM_OBJ:Nswap*} ${EXTRA_OBJ} vers.o \
${OBJS:M*swap${.TARGET}.o}
 
 TEXTADDR?= ${LOADADDRESS}  # backwards compatibility

Re: Fix for kern/51772 breaks linking multi-config kernels?

2017-01-04 Thread Yorick Hardy

On 2017-01-04, Yorick Hardy wrote:
> Dear Martin,
> 
> On 2017-01-04, Martin Husemann wrote:
> > On Wed, Jan 04, 2017 at 07:28:19PM +0200, Yorick Hardy wrote:
> > > Apologies, my "fix" broke your build. I wonder why it worked before,
> > > probably because you have "netbsd" as part of your kernel name?
> > > 
> > > Maybe the change should be reverted until the correct solution is found.
> > 
> > Just a side note: we have several evb* configs that use multiple config
> > statements, for example sys/arch/evbmips/conf/ZYXELKX:
> > 
> > config  netbsd root on ? type ?
> > config  netbsd-sd0a root on sd0a type ffs dumps none
> > config  netbsd-reth0 root on reth0 type nfs dumps none
> > 
> > 
> > Martin
> 
> I am to blame for testing only a very simple configuration!
> 
> But it still seems wrong...
> 
> Is it correct that ${SYSTEM_OBJ:N*swap*netbsd*} is only for
> configurations named "netbsd*" - are other names allowed?
> 
> Maybe we should just use ${SYSTEM_OBJ:N*swap*}, but I still
> need to check whether any other object files match this pattern.

The suggested pattern will not work.

Perhaps I am the only person with a kernel not named "netbsd*"!
I suggest reverting the change until we find a solution, with my
apologies.

-- 
Kind regards,

Yorick Hardy

Re: Fix for kern/51772 breaks linking multi-config kernels?

2017-01-04 Thread Yorick Hardy

Dear Martin,

On 2017-01-04, Martin Husemann wrote:
> On Wed, Jan 04, 2017 at 07:28:19PM +0200, Yorick Hardy wrote:
> > Apologies, my "fix" broke your build. I wonder why it worked before,
> > probably because you have "netbsd" as part of your kernel name?
> > 
> > Maybe the change should be reverted until the correct solution is found.
> 
> Just a side note: we have several evb* configs that use multiple config
> statements, for example sys/arch/evbmips/conf/ZYXELKX:
> 
> config  netbsd root on ? type ?
> config  netbsd-sd0a root on sd0a type ffs dumps none
> config  netbsd-reth0 root on reth0 type nfs dumps none
> 
> 
> Martin

I am to blame for testing only a very simple configuration!

But it still seems wrong...

Is it correct that ${SYSTEM_OBJ:N*swap*netbsd*} is only for
configurations named "netbsd*" - are other names allowed?

Maybe we should just use ${SYSTEM_OBJ:N*swap*}, but I still
need to check whether any other object files match this pattern.
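
A quick way to compare the candidate patterns is to run them over a sample
object list. The sketch below approximates make's :N modifier with Python's
fnmatch, which may differ from make in corner cases. The kernel name
"mykernel" is hypothetical, and uvm_swap.o is assumed to be among the kernel
objects. An anchored "swap*" is included for comparison.

```python
from fnmatch import fnmatchcase


def make_N(words, pattern):
    """Approximate make's :Npattern: keep the words that do NOT match."""
    return [w for w in words if not fnmatchcase(w, pattern)]


# Hypothetical object list for a kernel configured as "mykernel".
objs = ["locore.o", "uvm_swap.o", "swapmykernel.o"]

netbsd_only = make_N(objs, "*swap*netbsd*")  # removes nothing here
any_swap = make_N(objs, "*swap*")            # also removes uvm_swap.o
anchored = make_N(objs, "swap*")             # removes only swapmykernel.o

print(netbsd_only)
print(any_swap)
print(anchored)
```

With these assumptions, "*swap*netbsd*" leaves swapmykernel.o in place for a
kernel not named "netbsd*", while "*swap*" is too greedy because it also drops
an unrelated object.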

-- 
Kind regards,

Yorick Hardy

Re: Fix for kern/51772 breaks linking multi-config kernels?

2017-01-04 Thread Yorick Hardy

Dear John,

On 2017-01-04, John D. Baker wrote:
> Since this commit:
> 
>   http://mail-index.netbsd.org/source-changes/2017/01/04/msg080495.html
> 
> My custom kernel with multiple "config" statements:
> 
> include "arch/evbmips/conf/LOONGSON"
> [...]
> no config   netbsd
> config  netbsd_nfs  root on ? type nfs dumps on wd0j
> config  netbsd_sd0  root on sd0a type ffs dumps on wd0j
> config  netbsd_sd1  root on sd1a type ffs
> [...]
> 
> fails linking with:
> 
> [...]
> #  link  YEELOONG/netbsd_nfs
> /d0/build/current/tools/amd64/bin/mips64el--netbsd-ld -Map netbsd_nfs.map --cref -m elf64ltsmip -T netbsd_nfs.ldscript -Ttext 0x8020 -e start -G 0 -X -o netbsd_nfs ${SYSTEM_OBJ:N*swap*netbsd_nfs*} ${EXTRA_OBJ} vers.o swapnetbsd_nfs.o
> swapnetbsd_sd1.o:(.data+0x0): multiple definition of `rootfstype'
> swapnetbsd_sd0.o:(.data+0x0): first defined here
> swapnetbsd_sd1.o:(.data+0x8): multiple definition of `dumpdev'
> swapnetbsd_sd0.o:(.data+0x8): first defined here
> swapnetbsd_sd1.o:(.data+0x10): multiple definition of `dumpspec'
> swapnetbsd_sd0.o:(.data+0x10): first defined here
> swapnetbsd_sd1.o:(.data+0x18): multiple definition of `rootdev'
> swapnetbsd_sd0.o:(.data+0x18): first defined here
> swapnetbsd_sd1.o:(.data+0x20): multiple definition of `rootspec'
> swapnetbsd_sd0.o:(.data+0x20): first defined here
> swapnetbsd_nfs.o:(.data+0x0): multiple definition of `rootfstype'
> swapnetbsd_sd0.o:(.data+0x0): first defined here
> swapnetbsd_nfs.o:(.data+0x8): multiple definition of `dumpdev'
> swapnetbsd_sd0.o:(.data+0x8): first defined here
> swapnetbsd_nfs.o:(.data+0x10): multiple definition of `dumpspec'
> swapnetbsd_sd0.o:(.data+0x10): first defined here
> swapnetbsd_nfs.o:(.data+0x18): multiple definition of `rootdev'
> swapnetbsd_sd0.o:(.data+0x18): first defined here
> swapnetbsd_nfs.o:(.data+0x20): multiple definition of `rootspec'
> swapnetbsd_sd0.o:(.data+0x20): first defined here
> /d0/build/current/tools/amd64/bin/mips64el--netbsd-ld: Warning: netbsd_nfs uses -msoft-float (set by locore.o), mips_fpu.o uses -mhard-float
> /d0/build/current/tools/amd64/bin/mips64el--netbsd-ld: Warning: netbsd_nfs uses -msoft-float (set by locore.o), fp.o uses -mhard-float
> *** [netbsd_nfs] Error code 1
> 
> nbmake: stopped in /d0/build/current/obj/mips64el/sys/arch/evbmips/compile/YEELOONG
> 1 error
> 
> nbmake: stopped in /d0/build/current/obj/mips64el/sys/arch/evbmips/compile/YEELOONG
> 
> ERROR: Failed to make all in "/d0/build/current/obj/mips64el/sys/arch/evbmips/compile/YEELOONG"
> *** BUILD ABORTED ***
> 
> 
> Perhaps this is an update-build issue?  I'll wipe my ".../compile"
> directory and try again.

Apologies, my "fix" broke your build. I wonder why it worked before,
probably because you have "netbsd" as part of your kernel name?

Maybe the change should be reverted until the correct solution is found.
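
For what it is worth, the failure can be reproduced on paper. The sketch below
approximates the :N modifier with Python's fnmatch (not identical to make's
matcher) and compares the pattern from the failing link line with the earlier
N*swap*netbsd* pattern mentioned in this thread. The object list is a
shortened, illustrative stand-in for ${SYSTEM_OBJ}.

```python
from fnmatch import fnmatchcase


def link_objects(system_obj, target, exclude):
    """Approximate: ${SYSTEM_OBJ:N<exclude>} ... swap${.TARGET}.o"""
    kept = [o for o in system_obj if not fnmatchcase(o, exclude)]
    return kept + ["swap%s.o" % target]


# Illustrative stand-in for ${SYSTEM_OBJ} with three configured kernels.
system_obj = ["locore.o", "swapnetbsd_nfs.o",
              "swapnetbsd_sd0.o", "swapnetbsd_sd1.o"]

# Pattern seen in the failing link line (target netbsd_nfs).
new = link_objects(system_obj, "netbsd_nfs", "*swap*netbsd_nfs*")
# Earlier pattern with a literal "netbsd".
old = link_objects(system_obj, "netbsd_nfs", "*swap*netbsd*")

print(new)
print(old)
```

With the target-specific pattern the other two swap objects stay on the link
line, which matches the "multiple definition" errors; the literal "netbsd"
pattern removes all three only because every kernel name starts with it.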

-- 
Kind regards,

Yorick Hardy
