Re: UBSAN report for main [so: 14] /usr/bin/whatis: non-zero (48) and zero offsets from null pointer in qsort.c

2022-01-11 Thread Jan Kokemüller
On 11.01.22 22:08, Stefan Esser wrote:
> diff --git a/lib/libc/stdlib/qsort.c b/lib/libc/stdlib/qsort.c
> index 5016fff7895f..51c41e802330 100644
> --- a/lib/libc/stdlib/qsort.c
> +++ b/lib/libc/stdlib/qsort.c
> @@ -108,6 +108,8 @@ local_qsort(void *a, size_t n, size_t es, cmp_t *cmp, void *thunk)
>   int cmp_result;
>   int swap_cnt;
> 
> + if (__predict_false(a == NULL))
> + return;
>  loop:
>   swap_cnt = 0;
>   if (n < 7) {
> 
> This would also work to prevent the NULL pointer arithmetic for
> ports that might also pass a == NULL and n == 0 in certain cases.

The UB happens in this line, when "a == NULL" and "n == 0", right?

for (pm = (char *)a + es; pm < (char *)a + n * es; pm += es)

This is arithmetic on a pointer (the NULL pointer) which is not part of an
array, which is UB.

Then, wouldn't "if (__predict_false(n == 0))" be more appropriate than checking
for "a == NULL" here? Testing for "a == NULL" might suppress UBSAN warnings of
valid bugs, i.e. when "qsort" is called with "a == NULL" and "n != 0". In that
case UBSAN _should_ trigger.

UBSAN should not trigger when n == 0, though. At least, when "a" does point to
a valid array. But what about the case of "a == NULL && n == 0"? Is that deemed
UB? It looks like at least FreeBSD's "qsort_s" implementation says it's legal.

a != NULL (pointing to valid array), n != 0  ->  "normal" case, no UB
a != NULL (pointing to valid array), n == 0  ->  should not trigger UB, and
                                                 doesn't in the current
                                                 implementation
a == NULL, n == 0                            ->  should not trigger UB?
                                                 (debatable)

So if "a == NULL && n == 0" was deemed legal, then there would be no bug in
"mansearch.c", right?

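A minimal sketch of the distinction drawn above, assuming a hypothetical
helper count_elems() (not part of FreeBSD's qsort.c) that walks the array
with the same kind of pointer loop. Guarding on "n == 0" means no pointer
arithmetic is done on a null pointer in the empty case, while a call with
"a == NULL" and "n != 0" still reaches the arithmetic, so a sanitizer can
flag it:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: walks an array of n
 * elements of size es with a pointer loop and returns how many
 * elements it visited.
 */
static size_t
count_elems(const void *a, size_t n, size_t es)
{
	const char *p, *end;
	size_t visited = 0;

	/* Guard on the count, not the pointer: no arithmetic when n == 0. */
	if (n == 0)
		return 0;

	/* A zero element size would make the loop below never advance. */
	assert(es > 0);

	/* From here on, a must point to an array of n elements. */
	end = (const char *)a + n * es;
	for (p = (const char *)a; p < end; p += es)
		visited++;
	return visited;
}
```

With this guard, count_elems(NULL, 0, es) returns before touching the
pointer, which matches the "a == NULL, n == 0" row of the table.
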
-Jan



Re: armv7 regress fork-exec hangs machine

2022-01-11 Thread Alexander Bluhm
Works for me.  With this diff I get armv7 regress results again.
http://bluhm.genua.de/regress/results/regress-ot7.html

OK bluhm@

On Tue, Jan 11, 2022 at 01:10:27PM +0100, Mark Kettenis wrote:
> > Date: Mon, 10 Jan 2022 15:40:34 +0100 (CET)
> > From: Mark Kettenis 
> > 
> > > Date: Mon, 10 Jan 2022 14:40:50 +0100
> > > From: Alexander Bluhm 
> > > 
> > > On Thu, Jan 06, 2022 at 03:59:55PM +0100, Alexander Bluhm wrote:
> > > > My armv7 regress machine hangs every day in regress/sys/kern/fork-exit.
> > > 
> > > Maybe a show uvm provides some information.
> > 
> > Not really.  I can reproduce the issue here.  But I didn't have
> > ddb.console enabled :(.
> > 
> > > Stopped at  db_enter:   ldrb    r15, [r15, r15, ror r15]!
> > > ddb> trace 
> > > db_enter
> > > rlv=0xc06bd178 rfp=0xcea1cdd0
> > > ampintc_irq_handler+0x13c
> > > rlv=0xc05b79c8 rfp=0xcea1ce48
> > > irq_entry+0x78
> > > rlv=0xc03ba3f8 rfp=0xcea1ce60
> > > uaddr_bestfit_insert+0x24
> > > rlv=0xc0654994 rfp=0xcea1ce78
> > > uvm_mapent_free_insert+0xa8
> > > rlv=0xc0657e7c rfp=0xcea1cea0
> > > uvm_map_fix_space+0x208
> > > rlv=0xc06577b4 rfp=0xcea1cec8
> > > uvm_map_kmem_grow+0x154
> > > rlv=0xc0657044 rfp=0xcea1cf48
> > > uvm_map+0x3a8
> > > rlv=0xc071e4d4 rfp=0xcea1cfa8
> > > uvm_km_thread+0x10c
> > > rlv=0xc064a060 rfp=0xc0a49f50
> > > Bad frame pointer: 0xc0a49f50
> 
> So as far as I can determine, we simply run out of KVA when running
> this test.  I'm not sure why though.  It could be fragmentation,
> although AFAICT the km thread only does page-sized allocations.  And
> we don't have guard pages turned on, do we?  So maybe there is a leak
> somewhere...
> 
> That said, we have a really low amount of KVA on armv7.  It's
> basically 256MB plus what's left of the 64MB block we've loaded the
> kernel in.  Doubling this to 512MB (plus what's left of the 64MB
> block) makes the test pass, and brings us more in line with the other
> 32-bit platforms (i386 has 760MB of KVA).
> 
> ok?
> 
> 
> Index: arch/armv7/include/vmparam.h
> ===
> RCS file: /cvs/src/sys/arch/armv7/include/vmparam.h,v
> retrieving revision 1.6
> diff -u -p -r1.6 vmparam.h
> --- arch/armv7/include/vmparam.h  10 Mar 2017 08:42:08 -  1.6
> +++ arch/armv7/include/vmparam.h  11 Jan 2022 11:53:00 -
> @@ -62,7 +62,7 @@
>   */
>  #define  KERNEL_BASE ARM_KERNEL_BASE
>  
> -#define VM_KERNEL_SPACE_SIZE 0x10000000
> +#define VM_KERNEL_SPACE_SIZE 0x20000000
>  
>  /*
>   * Override the default pager_map size, there's not enough KVA.



Re: Acer Swift1 (SF114-34, N6000, Jasper Lake): iwx (ax201), azalia and emmc are not working/detected

2022-01-11 Thread Jonathan Gray
On Tue, Jan 11, 2022 at 09:06:42PM +0100, Sven Wolf wrote:
> 
> 
> On 1/11/22 01:42, Jonathan Gray wrote:
> > On Mon, Jan 10, 2022 at 08:56:18PM +0100, Sven Wolf wrote:
> > > 
> > > 
> > > On 1/10/22 02:02, Jonathan Gray wrote:
> > > > On Sun, Jan 09, 2022 at 09:27:57PM +0100, Sven Wolf wrote:
> > > > > Hi list,
> > > > > 
> > > > > > in October'21 I successfully installed OpenBSD on this little fanless
> > > > > > laptop.
> > > > > > There are the following issues, even with -current:
> > > > > > 
> > > > > > The soundcard (Linux dmesg: snd_hda_codec_realtek hdaudioC0D0: autoconfig
> > > > > > for ALC256) is not detected.
> > > > > > 
> > > > > > The emmc (Linux dmesg: mmc0: new HS400 Enhanced strobe MMC card at address
> > > > > > 0001, mmcblk0: mmc0:0001 DA4128 116 GiB) can't get enabled.
> > > > > > After I inserted an nvme into the empty internal port I got OpenBSD
> > > > > > installed.
> > > > > > 
> > > > > > The AX201 is not detected. In October'21 I got the AX201 into a stable
> > > > > > working state with the following patch:
> > > > > 
> > > > > cat pcidevs.diff
> > > > > *** pcidevs   Sun Jan  9 20:02:35 2022
> > > > > > --- pcidevs.swift1    Sun Jan  9 19:35:41 2022
> > > > > > ***
> > > > > > *** 5510,5515 
> > > > > > --- 5510,5516 
> > > > > > product INTEL RKL_GT_4  0x4c8c  UHD Graphics
> > > > > > product INTEL RKL_GT_5  0x4c90  UHD Graphics
> > > > > > product INTEL RKL_GT_6  0x4c9a  UHD Graphics
> > > > > > + product INTEL WL_22500_6  0x4df0  Wi-Fi 6 AX201
> > > > > > product INTEL JSL_GT_1  0x4e51  UHD Graphics
> > > > > > product INTEL JSL_GT_2  0x4e55  UHD Graphics
> > > > > > product INTEL JSL_GT_3  0x4e57  UHD Graphics
> > > > > 
> > > > > 
> > > > > cat if_iwx.c.diff
> > > > > *** if_iwx.c  Sun Jan  9 20:02:35 2022
> > > > > --- if_iwx.c.swift1   Sun Jan  9 19:37:41 2022
> > > > > ***
> > > > > *** 9177,9182 
> > > > > --- 9177,9183 
> > > > >   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_WL_22500_3 },
> > > > >   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_WL_22500_4,},
> > > > >   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_WL_22500_5,},
> > > > > + { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_WL_22500_6,},
> > > > > };
> > > > > 
> > > > > static const struct pci_matchid iwx_subsystem_id_ax201[] = {
> > > > > ***
> > > > > *** 9218,9223 
> > > > > --- 9219,9225 
> > > > >   case PCI_PRODUCT_INTEL_WL_22500_3: /* AX201 */
> > > > >   case PCI_PRODUCT_INTEL_WL_22500_4: /* AX201 */
> > > > >   case PCI_PRODUCT_INTEL_WL_22500_5: /* AX201 */
> > > > > + case PCI_PRODUCT_INTEL_WL_22500_6: /* AX201 */
> > > > >   for (i = 0; i < nitems(iwx_subsystem_id_ax201); i++) {
> > > > >   if (svid == iwx_subsystem_id_ax201[i].pm_vid &&
> > > > >   spid == iwx_subsystem_id_ax201[i].pm_pid)
> > > > > 
> > > > > > But now I only get the following message:
> > > > > > iwx0 at pci0 dev 20 function 3 "Intel Wi-Fi 6 AX201" rev 0x01, msix
> > > > > > iwx0: unknown adapter type
> > > > > > 
> > > > > > In Linux the following firmware is used:
> > > > > > Loading firmware: iwlwifi-QuZ-a0-hr-b0-63.ucode
> > > > > > iwlwifi :00:14.3: loaded firmware version 63.c04f3485.0
> > > > > > QuZ-a0-hr-b0-63.ucode op_mode iwlmvm
> > > > > > 
> > > > > > I hope that Stefan has an idea how we can get the iwx on this machine into
> > > > > > a working state.
> > > > > > 
> > > > > > The azalia and emmc issues are not the highest priority for me. Maybe
> > > > > > someone will make a patch.
> > > > > 
> > > > > The touchpad only works in PS/2 mode. Fortunately on this machine, the
> > > > > touchpad mode can be changed in the (hidden) BIOS/UEFI menu (CTRL+s)
> > > > 
> > > > The touchpad is likely connected via i2c.
> > > > 
> > > > The following diff should make audio and the touchpad work.
> > > > inteldrm will require the 5.15 port I'm working on.
> > 
> > committed with another id I missed added
> > 
> > does this help with emmc?
> > 
> > Index: sys/dev/pci/sdhc_pci.c
> > ===
> > RCS file: /cvs/src/sys/dev/pci/sdhc_pci.c,v
> > retrieving revision 1.21
> > diff -u -p -U4 -r1.21 sdhc_pci.c
> > --- sys/dev/pci/sdhc_pci.c  20 Nov 2019 16:34:58 -  1.21
> > +++ sys/dev/pci/sdhc_pci.c  11 Jan 2022 00:38:27 -
> > @@ -130,9 +130,10 @@ sdhc_pci_attach(struct device *parent, s
> > /* Some Intel controllers break if set to 0V bus power. */
> > if (PCI_VENDOR(pa->pa_id) == PCI_VENDOR_INTEL &&
> > (PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_INTEL_100SERIES_LP_EMMC ||
> > PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_INTEL_APOLLOLAKE_EMMC ||
> > -   PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_INTEL_GLK_EMMC))
> > +   PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_INTEL_GLK_EMMC ||
> > +   PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_INTEL_JSL_EMMC))
> > 

wireguard-related mbuf panic (was: Re: panic: ieee80211_has_seq(wh) assertion failed)

2022-01-11 Thread Paul de Weerd
Hi all,

More debugging later, it turns out that I can now pretty reliably
panic the GENERIC.MP kernel with a WireGuard tunnel that has an MTU of
1500 bytes, sending over the iwx(4) in my machine.

Christian Ehrhardt (over at Genua) provided a diff that printf's when
there's a read-only mbuf passed to m_pullup:

Index: uipc_mbuf.c
===
RCS file: /home/OpenBSD/cvs/src/sys/kern/uipc_mbuf.c,v
retrieving revision 1.279
diff -u -p -r1.279 uipc_mbuf.c
--- uipc_mbuf.c 6 Mar 2021 09:20:49 -   1.279
+++ uipc_mbuf.c 11 Jan 2022 20:51:45 -
@@ -937,6 +937,10 @@ m_pullup(struct mbuf *m0, int len)
if (m == NULL)
goto freem0;
 
+   if (M_READONLY(m))
+   printf("BUG: Calling %s() on read-only mbuf cluster\n",
+   __func__);
+
head = M_DATABUF(m0);
if (m0->m_len == 0) {
m0->m_data = head;

This gets triggered A LOT when doing transfers over the wg(4)
interface.  The panic doesn't happen every time this condition is met,
but it seems easier to trigger with multiple parallel transfers.
Lowering the MTU of wg0 to 1400, the printf() is no longer called, and
the kernel no longer panics.
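
The condition the debug printf reports can be pictured with a small
userland model.  This is only a sketch under assumptions: struct storage,
struct buf, buf_share(), buf_dup() and buf_readonly() are hypothetical
names, not the kernel's mbuf API, and the real M_READONLY() test involves
more than a reference count.  The model shows why a copy that shares
storage leaves every view read-only, while a deep copy gets private,
writable storage:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical storage shared between buffer views, with a reference count. */
struct storage {
	unsigned char	*data;
	size_t		 len;
	int		 refs;
};

/* Hypothetical buffer view; stands in for an mbuf pointing at a cluster. */
struct buf {
	struct storage	*st;
};

/* Create a buffer that owns fresh, zeroed storage; len must be non-zero. */
static struct buf
buf_new(size_t len)
{
	struct buf b;

	b.st = malloc(sizeof(*b.st));
	assert(b.st != NULL);
	b.st->data = calloc(1, len);
	assert(b.st->data != NULL);
	b.st->len = len;
	b.st->refs = 1;
	return b;
}

/* Shallow copy: a second view onto the same storage. */
static struct buf
buf_share(struct buf *src)
{
	struct buf b = { src->st };

	src->st->refs++;
	return b;
}

/* Deep copy: new storage holding a private copy of the bytes. */
static struct buf
buf_dup(const struct buf *src)
{
	struct buf b = buf_new(src->st->len);

	memcpy(b.st->data, src->st->data, src->st->len);
	return b;
}

/* A view must not be written through while its storage is shared. */
static int
buf_readonly(const struct buf *b)
{
	return b->st->refs > 1;
}

/* Drop one view; the storage is released with the last reference. */
static void
buf_free(struct buf *b)
{
	if (--b->st->refs == 0) {
		free(b->st->data);
		free(b->st);
	}
	b->st = NULL;
}
```

In this model a routine that rearranges data in place, as m_pullup() may
do, is only safe on a buffer for which buf_readonly() returns 0.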

I changed Christian's diff to enter ddb (by calling assert(FALSE))
after the printf.  The resulting trace showed:

panic() at panic+0xbf
__assert() at __assert+0x25
m_pullup() at m_pullup+0x304
wg_input() at wg_input+0x17a
udp_sbappend() at udp_sbappend+0x78
...

Does this mean there's a problem with the call to m_pullup in wg_input
(if_wg.c:2031)?

Again, many thanks to stsp@ and Christian for their patience, diffs,
help and suggestions.

Paul

On Mon, Jan 10, 2022 at 10:05:14AM +0100, Paul de Weerd wrote:
| Hi all,
| 
| After many kernel rebuilds, data transfers, panics and a lot of
| back-and-forth with stsp@, and later also Christian Ehrhardt (many
| thanks for their patience to both of them!), we've found the
| following:
| 
| The problem started after the commit by stsp@ that implements Rx
| aggregation offload in iwm and iwx:
| 
| http://cvsweb.openbsd.org/src/sys/dev/pci/if_iwx.c?rev=1.53=text/x-cvsweb-markup
| 
| At the suggestion of Christian, I then ran with a diff that changes
| the m_copym in if_iwx.c:8651 to m_dup_pkt (which needed some more
| changes from Christian).  This "fixed" the issue (worked around it): I
| could no longer panic the machine.
| 
| Stefan then suggested disabling some other bits, as I have a vmm(4) VM
| on this laptop for which the host OS routes traffic and a wg(4) tunnel
| for IPv6 connectivity (both for the host OS and the VM).  Turns out
| that disabling the wg(4) tunnel prevents the system from panic'ing.
| It's still not easy to reproduce this issue, it sometimes takes
| several days (and many GBs of data transferred) before it triggers.
| 
| So it seems there's some kind of mbuf corruption going on.  Quoting
| from one of Stefan's e-mails:
| 
| > Not necessarily, since other NICs probably won't use multiple mbufs
| > to get a per-frame split-up view of a large DMA buffer that contains
| > multiple frames, like iwm/iwx do when firmware puts more than one
| > frame into this buffer.
| > 
| > I believe any of the mbufs which we layer on top of this large buffer
| > with m_split() might have been used by some other component of the
| > system before iwx grabs such an mbuf.
| > 
| > So I suspect there will either be a buffer overrun of some m->m_data
| > pointer which ends up writing into an adjacent mbuf cluster which happens
| > to be in use by iwx, or there will be a use-after-free write via an
| > m->m_data pointer which was previously used by some other network
| > stack component and now points to one of the frames in the DMA buffer.
| 
| I'm going to do some more experimenting with the wg setup to see if I
| can find more clues, but if anyone has any suggestions on what to try
| next, please share!
| 
| Thanks,
| 
| Paul
| 
| On Mon, Nov 08, 2021 at 01:57:53PM +0100, Paul de Weerd wrote:
| | Hi all,
| | 
| | After upgrading my laptop to a newer snapshot this weekend, I started
| | getting panics.   I was running OpenBSD 7.0-current (GENERIC.MP) #60:
| | Sun Oct 31 13:27:05 MDT 2021 before the upgrade.  Hand-typed from a
| | picture I took:
| | 
| | panic: kernel diagnostic assertion "ieee80211_has_seq(hw)" failed: file "/usr/src/sys/net80211/ieee80211_input.c", line 145
| | Stopped at  db_enter+0x10:  popq    %rbp
| |     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
| |  503471  51765    107    0x100010      0x400    3  vmd
| |  333711  51765    107    0x100010      0x400    1  vmd
| |  148566  72734      0     0x14000      0x200    2  drmwq
| | db_enter() at db_enter+0x10
| | panic(81e52e8a) at panic+0xbf
| | __assert(81ec2db8,81ef7c3b,91,81ecda4c) at __assert+0x25
| | ieee80211_get_hdrlen(fd805ce2967a) at ieee80211_get_hdrlen+0x8f
| | iwx_ccmp_decap(...) at 

Re: Acer Swift1 (SF114-34, N6000, Jasper Lake): iwx (ax201), azalia and emmc are not working/detected

2022-01-11 Thread Sven Wolf




On 1/11/22 01:42, Jonathan Gray wrote:
> On Mon, Jan 10, 2022 at 08:56:18PM +0100, Sven Wolf wrote:
> > On 1/10/22 02:02, Jonathan Gray wrote:
> > > On Sun, Jan 09, 2022 at 09:27:57PM +0100, Sven Wolf wrote:
> > > > Hi list,
> > > > 
> > > > in October'21 I successfully installed OpenBSD on this little fanless
> > > > laptop.
> > > > There are the following issues, even with -current:
> > > > 
> > > > The soundcard (Linux dmesg: snd_hda_codec_realtek hdaudioC0D0: autoconfig
> > > > for ALC256) is not detected.
> > > > 
> > > > The emmc (Linux dmesg: mmc0: new HS400 Enhanced strobe MMC card at address
> > > > 0001, mmcblk0: mmc0:0001 DA4128 116 GiB) can't get enabled.
> > > > After I inserted an nvme into the empty internal port I got OpenBSD
> > > > installed.
> > > > 
> > > > The AX201 is not detected. In October'21 I got the AX201 into a stable
> > > > working state with the following patch:
> > > > 
> > > > cat pcidevs.diff
> > > > *** pcidevs Sun Jan  9 20:02:35 2022
> > > > --- pcidevs.swift1  Sun Jan  9 19:35:41 2022
> > > > ***
> > > > *** 5510,5515 
> > > > --- 5510,5516 
> > > > product INTEL RKL_GT_4  0x4c8c  UHD Graphics
> > > > product INTEL RKL_GT_5  0x4c90  UHD Graphics
> > > > product INTEL RKL_GT_6  0x4c9a  UHD Graphics
> > > > + product INTEL WL_22500_6  0x4df0  Wi-Fi 6 AX201
> > > > product INTEL JSL_GT_1  0x4e51  UHD Graphics
> > > > product INTEL JSL_GT_2  0x4e55  UHD Graphics
> > > > product INTEL JSL_GT_3  0x4e57  UHD Graphics
> > > > 
> > > > 
> > > > cat if_iwx.c.diff
> > > > *** if_iwx.c    Sun Jan  9 20:02:35 2022
> > > > --- if_iwx.c.swift1 Sun Jan  9 19:37:41 2022
> > > > ***
> > > > *** 9177,9182 
> > > > --- 9177,9183 
> > > > { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_WL_22500_3 },
> > > > { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_WL_22500_4,},
> > > > { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_WL_22500_5,},
> > > > +   { PCI_VENDOR_INTEL, PCI_PRODUCT_INTEL_WL_22500_6,},
> > > > };
> > > > 
> > > > static const struct pci_matchid iwx_subsystem_id_ax201[] = {
> > > > ***
> > > > *** 9218,9223 
> > > > --- 9219,9225 
> > > > case PCI_PRODUCT_INTEL_WL_22500_3: /* AX201 */
> > > > case PCI_PRODUCT_INTEL_WL_22500_4: /* AX201 */
> > > > case PCI_PRODUCT_INTEL_WL_22500_5: /* AX201 */
> > > > +   case PCI_PRODUCT_INTEL_WL_22500_6: /* AX201 */
> > > > for (i = 0; i < nitems(iwx_subsystem_id_ax201); i++) {
> > > > if (svid == iwx_subsystem_id_ax201[i].pm_vid &&
> > > > spid == iwx_subsystem_id_ax201[i].pm_pid)
> > > > 
> > > > But now I only get the following message:
> > > > iwx0 at pci0 dev 20 function 3 "Intel Wi-Fi 6 AX201" rev 0x01, msix
> > > > iwx0: unknown adapter type
> > > > 
> > > > In Linux the following firmware is used:
> > > > Loading firmware: iwlwifi-QuZ-a0-hr-b0-63.ucode
> > > > iwlwifi :00:14.3: loaded firmware version 63.c04f3485.0
> > > > QuZ-a0-hr-b0-63.ucode op_mode iwlmvm
> > > > 
> > > > I hope that Stefan has an idea how we can get the iwx on this machine into
> > > > a working state.
> > > > 
> > > > The azalia and emmc issues are not the highest priority for me. Maybe
> > > > someone will make a patch.
> > > > 
> > > > The touchpad only works in PS/2 mode. Fortunately on this machine, the
> > > > touchpad mode can be changed in the (hidden) BIOS/UEFI menu (CTRL+s)
> > > 
> > > The touchpad is likely connected via i2c.
> > > 
> > > The following diff should make audio and the touchpad work.
> > > inteldrm will require the 5.15 port I'm working on.
> 
> committed with another id I missed added
> 
> does this help with emmc?
> 
> Index: sys/dev/pci/sdhc_pci.c
> ===
> RCS file: /cvs/src/sys/dev/pci/sdhc_pci.c,v
> retrieving revision 1.21
> diff -u -p -U4 -r1.21 sdhc_pci.c
> --- sys/dev/pci/sdhc_pci.c  20 Nov 2019 16:34:58 -  1.21
> +++ sys/dev/pci/sdhc_pci.c  11 Jan 2022 00:38:27 -
> @@ -130,9 +130,10 @@ sdhc_pci_attach(struct device *parent, s
> /* Some Intel controllers break if set to 0V bus power. */
> if (PCI_VENDOR(pa->pa_id) == PCI_VENDOR_INTEL &&
> (PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_INTEL_100SERIES_LP_EMMC ||
> PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_INTEL_APOLLOLAKE_EMMC ||
> -   PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_INTEL_GLK_EMMC))
> +   PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_INTEL_GLK_EMMC ||
> +   PCI_PRODUCT(pa->pa_id) == PCI_PRODUCT_INTEL_JSL_EMMC))
> sc->sc.sc_flags |= SDHC_F_NOPWR0;
> 
> /* Some RICOH controllers need to be bumped into the right mode. */
> if (PCI_VENDOR(pa->pa_id) == PCI_VENDOR_RICOH &&



Thanks :)
Also this patch works perfectly.
The emmc is now accessible.

Sven



Re: armv7 regress fork-exec hangs machine

2022-01-11 Thread Theo de Raadt
Yes, increase it.

I am not surprised, there are many poor recoveries from resource shortage.


Mark Kettenis  wrote:

> > Date: Mon, 10 Jan 2022 15:40:34 +0100 (CET)
> > From: Mark Kettenis 
> > 
> > > Date: Mon, 10 Jan 2022 14:40:50 +0100
> > > From: Alexander Bluhm 
> > > 
> > > On Thu, Jan 06, 2022 at 03:59:55PM +0100, Alexander Bluhm wrote:
> > > > My armv7 regress machine hangs every day in regress/sys/kern/fork-exit.
> > > 
> > > Maybe a show uvm provides some information.
> > 
> > Not really.  I can reproduce the issue here.  But I didn't have
> > ddb.console enabled :(.
> > 
> > > Stopped at  db_enter:   ldrb    r15, [r15, r15, ror r15]!
> > > ddb> trace 
> > > db_enter
> > > rlv=0xc06bd178 rfp=0xcea1cdd0
> > > ampintc_irq_handler+0x13c
> > > rlv=0xc05b79c8 rfp=0xcea1ce48
> > > irq_entry+0x78
> > > rlv=0xc03ba3f8 rfp=0xcea1ce60
> > > uaddr_bestfit_insert+0x24
> > > rlv=0xc0654994 rfp=0xcea1ce78
> > > uvm_mapent_free_insert+0xa8
> > > rlv=0xc0657e7c rfp=0xcea1cea0
> > > uvm_map_fix_space+0x208
> > > rlv=0xc06577b4 rfp=0xcea1cec8
> > > uvm_map_kmem_grow+0x154
> > > rlv=0xc0657044 rfp=0xcea1cf48
> > > uvm_map+0x3a8
> > > rlv=0xc071e4d4 rfp=0xcea1cfa8
> > > uvm_km_thread+0x10c
> > > rlv=0xc064a060 rfp=0xc0a49f50
> > > Bad frame pointer: 0xc0a49f50
> 
> So as far as I can determine, we simply run out of KVA when running
> this test.  I'm not sure why though.  It could be fragmentation,
> although AFAICT the km thread only does page-sized allocations.  And
> we don't have guard pages turned on, do we?  So maybe there is a leak
> somewhere...
> 
> That said, we have a really low amount of KVA on armv7.  It's
> basically 256MB plus what's left of the 64MB block we've loaded the
> kernel in.  Doubling this to 512MB (plus what's left of the 64MB
> block) makes the test pass, and brings us more in line with the other
> 32-bit platforms (i386 has 760MB of KVA).
> 
> ok?
> 
> 
> Index: arch/armv7/include/vmparam.h
> ===
> RCS file: /cvs/src/sys/arch/armv7/include/vmparam.h,v
> retrieving revision 1.6
> diff -u -p -r1.6 vmparam.h
> --- arch/armv7/include/vmparam.h  10 Mar 2017 08:42:08 -  1.6
> +++ arch/armv7/include/vmparam.h  11 Jan 2022 11:53:00 -
> @@ -62,7 +62,7 @@
>   */
>  #define  KERNEL_BASE ARM_KERNEL_BASE
>  
> -#define VM_KERNEL_SPACE_SIZE 0x10000000
> +#define VM_KERNEL_SPACE_SIZE 0x20000000
>  
>  /*
>   * Override the default pager_map size, there's not enough KVA.
> 



Re: UBSAN report for main [so: 14] /usr/bin/whatis: non-zero (48) and zero offsets from null pointer in qsort.c

2022-01-11 Thread Stefan Esser
Am 11.01.22 um 08:40 schrieb Mark Millard:
> # whatis dog
> /usr/main-src/lib/libc/stdlib/qsort.c:114:23: runtime error: applying non-zero offset 48 to null pointer
> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/main-src/lib/libc/stdlib/qsort.c:114:23 in
> /usr/main-src/lib/libc/stdlib/qsort.c:114:44: runtime error: applying zero offset to null pointer
> SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/main-src/lib/libc/stdlib/qsort.c:114:44 in
> whatis: nothing appropriate
> 
> This seems to be only for the not-found case.
> 
> ===
> Mark Millard
> marklmi at yahoo.com

The undefined behavior is caused by insufficient checking of parameters
in mansearch.c.

As part of the initializations performed at the start of mansearch(),
the variables cur and *res are initialized to 0 and NULL, respectively:

cur = maxres = 0;   
if (res != NULL)
*res = NULL;

If no match is found, these values are unchanged at line 223, where res
is checked to be non-NULL, but then *res is passed to qsort() and that
is still NULL.

Suggested fix (also attached to avoid white-space issues):

--- usr.bin/mandoc/mansearch.c
+++ usr.bin/mandoc/mansearch.c
@@ -220,7 +220,7 @@
if (cur && search->firstmatch)
break;
}
-   if (res != NULL)
+   if (res != NULL && *res != NULL)
qsort(*res, cur, sizeof(struct manpage), manpage_compare);
if (chdir_status && getcwd_status && chdir(buf) == -1)
warn("%s", buf);

(File name as in OpenBSD, it is contrib/mandoc/mansearch.c in FreeBSD.)
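
The effect of the extra check can be sketched in isolation.  This is an
illustration under assumptions, not mandoc code: struct item stands in for
struct manpage, and item_compare() and sort_results() are hypothetical
names.  The point is only that qsort() is never handed a NULL array when
no result was stored:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct manpage; only the sort key matters here. */
struct item {
	int key;
};

/* Three-way comparison on the key, in the form qsort() expects. */
static int
item_compare(const void *va, const void *vb)
{
	const struct item *a = va, *b = vb;

	return (a->key > b->key) - (a->key < b->key);
}

/*
 * Mirror of the suggested check: sort only when the caller asked for
 * results (res != NULL) and at least one result was stored (*res != NULL).
 * Returns 1 if qsort() was called, 0 if it was skipped.
 */
static int
sort_results(struct item **res, size_t cur)
{
	if (res != NULL && *res != NULL) {
		qsort(*res, cur, sizeof(struct item), item_compare);
		return 1;
	}
	return 0;
}
```

In the not-found case *res is still NULL and cur is 0, so sort_results()
returns 0 without calling qsort().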

Regards, STefan


Re: armv7 regress fork-exec hangs machine

2022-01-11 Thread Mark Kettenis
> Date: Mon, 10 Jan 2022 15:40:34 +0100 (CET)
> From: Mark Kettenis 
> 
> > Date: Mon, 10 Jan 2022 14:40:50 +0100
> > From: Alexander Bluhm 
> > 
> > On Thu, Jan 06, 2022 at 03:59:55PM +0100, Alexander Bluhm wrote:
> > > My armv7 regress machine hangs every day in regress/sys/kern/fork-exit.
> > 
> > Maybe a show uvm provides some information.
> 
> Not really.  I can reproduce the issue here.  But I didn't have
> ddb.console enabled :(.
> 
> > Stopped at  db_enter:   ldrb    r15, [r15, r15, ror r15]!
> > ddb> trace 
> > db_enter
> > rlv=0xc06bd178 rfp=0xcea1cdd0
> > ampintc_irq_handler+0x13c
> > rlv=0xc05b79c8 rfp=0xcea1ce48
> > irq_entry+0x78
> > rlv=0xc03ba3f8 rfp=0xcea1ce60
> > uaddr_bestfit_insert+0x24
> > rlv=0xc0654994 rfp=0xcea1ce78
> > uvm_mapent_free_insert+0xa8
> > rlv=0xc0657e7c rfp=0xcea1cea0
> > uvm_map_fix_space+0x208
> > rlv=0xc06577b4 rfp=0xcea1cec8
> > uvm_map_kmem_grow+0x154
> > rlv=0xc0657044 rfp=0xcea1cf48
> > uvm_map+0x3a8
> > rlv=0xc071e4d4 rfp=0xcea1cfa8
> > uvm_km_thread+0x10c
> > rlv=0xc064a060 rfp=0xc0a49f50
> > Bad frame pointer: 0xc0a49f50

So as far as I can determine, we simply run out of KVA when running
this test.  I'm not sure why though.  It could be fragmentation,
although AFAICT the km thread only does page-sized allocations.  And
we don't have guard pages turned on, do we?  So maybe there is a leak
somewhere...

That said, we have a really low amount of KVA on armv7.  It's
basically 256MB plus what's left of the 64MB block we've loaded the
kernel in.  Doubling this to 512MB (plus what's left of the 64MB
block) makes the test pass, and brings us more in line with the other
32-bit platforms (i386 has 760MB of KVA).

ok?


Index: arch/armv7/include/vmparam.h
===
RCS file: /cvs/src/sys/arch/armv7/include/vmparam.h,v
retrieving revision 1.6
diff -u -p -r1.6 vmparam.h
--- arch/armv7/include/vmparam.h10 Mar 2017 08:42:08 -  1.6
+++ arch/armv7/include/vmparam.h11 Jan 2022 11:53:00 -
@@ -62,7 +62,7 @@
  */
 #defineKERNEL_BASE ARM_KERNEL_BASE
 
-#define VM_KERNEL_SPACE_SIZE   0x10000000
+#define VM_KERNEL_SPACE_SIZE   0x20000000
 
 /*
  * Override the default pager_map size, there's not enough KVA.
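
The sizes discussed in this message can be checked with a little
arithmetic.  This is a sketch, not kernel code: it assumes the old and new
VM_KERNEL_SPACE_SIZE values are 0x10000000 and 0x20000000, matching the
256MB and 512MB named above, and OLD_KVA_SIZE, NEW_KVA_SIZE and to_mb()
are names made up for illustration:

```c
/* Assumed values, chosen to match the 256MB and 512MB named in the message. */
#define OLD_KVA_SIZE	0x10000000UL
#define NEW_KVA_SIZE	0x20000000UL

/* Convert a byte count to whole megabytes (1MB = 1024 * 1024 bytes). */
static unsigned long
to_mb(unsigned long bytes)
{
	return bytes / (1024UL * 1024UL);
}
```

Under these assumptions to_mb(OLD_KVA_SIZE) is 256 and to_mb(NEW_KVA_SIZE)
is 512, so the change doubles the kernel virtual address space.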