Re: [www] fix broken links on loongson.html

2023-08-05 Thread Moritz Buhl
Hi,

I noticed this today and commited your change.

mbuhl

On Tue, Jun 02, 2020 at 12:33:36PM +, Yifei Zhan wrote:
> Hi,
> 
> Several links on loongson.html are now broken as zkml.lemote.com is no
> longer being resolved.
> 
> I think it's a good idea to redirect them to their archived version on
> archive.org.
> 
> Cheers,
> Yifei Zhan
> 
> 

> Index: loongson.html
> ===
> RCS file: /cvs/www/loongson.html,v
> retrieving revision 1.66
> diff -u -p -r1.66 loongson.html
> --- loongson.html 19 May 2020 02:02:10 -  1.66
> +++ loongson.html 31 May 2020 06:55:56 -
> @@ -73,16 +73,16 @@ Specific hardware support is then writte
>  At the moment, the following machines are supported:
>  
>  
> - href="http://zkml.lemote.com/en/products/mini-computer/2010/0310/111.html;>Lemote
>  Fuloong 2F
> + href="https://web.archive.org/web/20190101154956/http://zkml.lemote.com/en/products/mini-computer/2010/0310/111.html;>Lemote
>  Fuloong 2F
>  mini-PC
>  
>  All on-board devices are supported, but the framebuffer is currently limited
>  to the 640x400x8 video mode set up by the firmware.
> - href="http://zkml.lemote.com/en/products/all-in-one/2010/0311/122.html;>Lemote
>  Lynloong all-in-one-PC
> + href="https://web.archive.org/web/20190101150011/http://zkml.lemote.com/en/products/all-in-one/2010/0311/122.html;>Lemote
>  Lynloong all-in-one-PC
>  
>  All on-board devices are supported, but the framebuffer is currently limited
>  to the 1360x768x16 video mode set up by the firmware.
> - href="http://zkml.lemote.com/en/products/Notebook/2010/0310/112.html;>Lemote 
> Yeeloong
> + href="https://web.archive.org/web/20160703160118/http://zkml.lemote.com/en/products/Notebook/2010/0310/112.html;>Lemote
>  Yeeloong
>  netbook
>  
>  Both the 8.9" and 10.1" models are supported.



use designated initializer for ffs_vtbl

2023-04-14 Thread Moritz Buhl
Dear tech,

while reading through ffs code I was looking for calls to ffs_truncate.
This would have saved me a few seconds.

OK?
mbuhl

Index: ufs/ffs/ffs_vfsops.c
===
RCS file: /cvs/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.193
diff -u -p -r1.193 ffs_vfsops.c
--- ufs/ffs/ffs_vfsops.c12 Aug 2022 14:30:53 -  1.193
+++ ufs/ffs/ffs_vfsops.c14 Apr 2023 11:35:16 -
@@ -89,12 +89,12 @@ const struct vfsops ffs_vfsops = {
 };
 
 struct inode_vtbl ffs_vtbl = {
-   ffs_truncate,
-   ffs_update,
-   ffs_inode_alloc,
-   ffs_inode_free,
-   ffs_balloc,
-   ffs_bufatoff
+   .iv_truncate= ffs_truncate,
+   .iv_update  = ffs_update,
+   .iv_inode_alloc = ffs_inode_alloc,
+   .iv_inode_free  = ffs_inode_free,
+   .iv_buf_alloc   = ffs_balloc,
+   .iv_bufatoff= ffs_bufatoff
 };
 
 int



Re: installer: rpi: make softraid install work

2023-04-04 Thread Moritz Buhl
On Mon, Apr 03, 2023 at 05:53:49PM +, Klemens Nanni wrote:
> 03.04.2023 15:03, Klemens Nanni пишет:
> > This means no behaviour change for plain installs, but working boot for
> > softraid installs.
> > 
> > 'cvs diff -b -U0':
> > 
> > @@ -39 +39 @@ md_installboot() {
> > -   local _disk=$1 _mdec _plat
> > +   local _disk=$1 _chunks _bootdisk _mdec _plat
> > @@ -62 +62,3 @@ md_installboot() {
> > -   mount ${MOUNT_ARGS_msdos} /dev/${_disk}i /mnt/mnt
> > +   _chunks=$(get_softraid_chunks)
> > +   for _bootdisk in ${_chunks:-$_disk}; do
> > +   mount ${MOUNT_ARGS_msdos} /dev/${_bootdisk}i 
> > /mnt/mnt
> > @@ -74,0 +77 @@ md_installboot() {
> > +   done
> > 
> > (I have yet to dig out my rpi4, fix the console cable and test myself...)
> 
> As expected:
> - Plain  install on wiped SD card boots.
> - Crypto install on wiped SD card boots.
> 
> No other boards are effected by this diff.
> 
> OK?

I tested this on a rpi zero 2 w (no UEFI Firmware):
machdep.compatible=raspberrypi,model-zero-2-w

I can confirm that without this diff it tries to mount the msdos
partition of the SR CRYPTO disk.  After applying this diff the
installer uses the chunk disk instead.

ok mbuhl



Re: panic: ffs_valloc: dup alloc

2023-03-17 Thread Moritz Buhl
Below is an updated diff addressing the following concerns:

On Fri, Mar 17, 2023 at 03:45:59PM +0100, Alexander Bluhm wrote:
> The FreeBSD Code puts the check_nifree label before the if
> (fs->fs_cs(fs, cg).cs_nifree == 0) check.  Could you move this chunk
> a little up to keep us in sync?

On Fri, Mar 17, 2023 at 03:57:39PM +0100, Mark Kettenis wrote:
> Don't hide a global variable like that!

There are also some concerns on the string "indallck",
to my understanding the first 7 characters show up as wchan in ps
and should be easy to grep for.
Any better names?
"ffs-node-lock"
"ffsndlock"

Index: ffs_alloc.c
===
RCS file: /cvs/src/sys/ufs/ffs/ffs_alloc.c,v
retrieving revision 1.114
diff -u -p -r1.114 ffs_alloc.c
--- ffs_alloc.c 11 Mar 2021 13:31:35 -  1.114
+++ ffs_alloc.c 17 Mar 2023 19:28:11 -
@@ -46,6 +46,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -75,6 +76,7 @@ daddr_t   ffs_mapsearch(struct fs *, stru
 
 static const struct timevalfserr_interval = { 2, 0 };
 
+struct rwlock ffs_node_lock = RWLOCK_INITIALIZER("indallck");
 
 /*
  * Allocate a block in the file system.
@@ -1108,6 +1110,9 @@ ffs_nodealloccg(struct inode *ip, u_int 
 * and in the cylinder group itself.
 */
fs = ip->i_fs;
+#ifdef FFS2
+ check_nifree:
+#endif
if (fs->fs_cs(fs, cg).cs_nifree == 0)
return (0);
 
@@ -1201,9 +1206,12 @@ gotit:
/* Has any inode not been used at least once? */
cgp->cg_initediblk < cgp->cg_ffs2_niblk) {
 
+   if (rw_enter(_node_lock, RW_WRITE | RW_SLEEPFAIL))
+   goto check_nifree;
 ibp = getblk(ip->i_devvp, fsbtodb(fs,
 ino_to_fsba(fs, cg * fs->fs_ipg + cgp->cg_initediblk)),
 (int)fs->fs_bsize, 0, INFSLP);
+   rw_exit(_node_lock);
 
 memset(ibp->b_data, 0, fs->fs_bsize);
 dp2 = (struct ufs2_dinode *)(ibp->b_data);



panic: ffs_valloc: dup alloc

2023-03-17 Thread Moritz Buhl
Dear tech,

I have a machine with some kind of hardware defect.
Smartctl shows that one SSD regularly has an unexpected power loss
(Attribute 174):

174 Unknown_Attribute   0x0032   100   100   000Old_age
Always   -   284.

On the console I see the following output:

root on sd0a (b75e60dd651bd99e.a) swap on sd0b dump on sd0b
ahci0: device didn't come ready after reset, TFD: 0x441
ahci0: unrecoverable errors (IS: 10), disabling port.
ahci0: device didn't come ready after reset, TFD: 0x1d0
ahci0: stopping the port, softreset slot 31 was still active.
ahci0: device didn't come ready after reset, TFD: 0x441
ahci0: device didn't come ready after reset, TFD: 0x441
ahci0: stopping the port, softreset slot 31 was still active.
ahci0: device didn't come ready after reset, TFD: 0x441
ahci0: stopping the port, softreset slot 31 was still active.
ahci0: stopping the port, softreset slot 31 was still active.
ahci0: stopping the port, softreset slot 31 was still active.
ahci0: stopping the port, softreset slot 31 was still active.
ahci0: stopping the port, softreset slot 31 was still active.
ahci0: stopping the port, softreset slot 31 was still active.
ahci0: device didn't come ready after reset, TFD: 0x441
ahci0: device didn't come ready after reset, TFD: 0x441
mode = 0100644, inum = 7731315, fs = /var
panic: ffs_valloc: dup alloc
Stopped at  db_enter+0x10:  popq%rbp
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
 175271  46390   1001 0x2  0x4000  rustc
 345274  46390   1001 0x2  0x4001  rustc
 224769  52321   1001 0x2  0x4005  rustc
*414814  59107   1001  0x1002  0x4082K rustc
  87951  59107   1001  0x1002  0x4084  rustc
db_enter() at db_enter+0x10
panic(81f194f6) at panic+0xbf
ffs_inode_alloc(fd87b0b8c1e0,81a4,fd8fc6cfc5d0,800020a7a948) at ffs
_inode_alloc+0x42e
ufs_makeinode(81a4,fd87ab540438,800020a7ac40,800020a7ac70) at ufs_m
akeinode+0x79
ufs_create(800020a7a9f8) at ufs_create+0x3c
VOP_CREATE(fd87ab540438,800020a7ac40,800020a7ac70,800020a7aa50)
 at VOP_CREATE+0x3f
vn_open(800020a7ac10,10602,1a4) at vn_open+0x162
doopenat(800020879cf0,ff9c,246e5646200,10601,1b6,800020a7adf0) at d
oopenat+0x1cd
syscall(800020a7ae60) at syscall+0x35f
Xsyscall() at Xsyscall+0x128
end of kernel

ffs_inode_alloc checks the mode of the newly allocated inode and
if it is not 0 it panics.
I suspect that due to the ssds softreset that IO operations take
longer and cause trouble due to missing locking.

For example in ffs_nodealloccg it is possible for an FFS2 file
system to sleep due to a getblk. During the sleep, another call to
ffs_nodealloccg might happen with the same ipref.
The function first checks if there is space at the prefered location,
then sleeps and then reserves it. This is a TOCTOU bug.

FreeBSD fixed this in 2013:
https://github.com/freebsd/freebsd-src/commit/94f4ac214c403b99cbc44dd6e9cdf4db2cc9b297

Below is a diff that addresses this with a exclusive lock
using RW_SLEEPFAIL.

Index: ufs/ffs/ffs_alloc.c
===
RCS file: /cvs/src/sys/ufs/ffs/ffs_alloc.c,v
retrieving revision 1.114
diff -u -p -r1.114 ffs_alloc.c
--- ufs/ffs/ffs_alloc.c 11 Mar 2021 13:31:35 -  1.114
+++ ufs/ffs/ffs_alloc.c 10 Mar 2023 15:44:52 -
@@ -46,6 +46,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1089,6 +1090,7 @@ gotit:
 }
 
 /* inode allocation routine */
+struct rwlock ffs_node_lock = RWLOCK_INITIALIZER("indallck");
 daddr_t
 ffs_nodealloccg(struct inode *ip, u_int cg, daddr_t ipref, int mode)
 {
@@ -1115,6 +1117,9 @@ ffs_nodealloccg(struct inode *ip, u_int 
return (0);
 
cgp = (struct cg *)bp->b_data;
+#ifdef FFS2
+ check_nifree:
+#endif
if (cgp->cg_cs.cs_nifree == 0) {
brelse(bp);
return (0);
@@ -1201,9 +1206,12 @@ gotit:
/* Has any inode not been used at least once? */
cgp->cg_initediblk < cgp->cg_ffs2_niblk) {
 
+   if (rw_enter(_node_lock, RW_WRITE | RW_SLEEPFAIL))
+   goto check_nifree;
 ibp = getblk(ip->i_devvp, fsbtodb(fs,
 ino_to_fsba(fs, cg * fs->fs_ipg + cgp->cg_initediblk)),
 (int)fs->fs_bsize, 0, INFSLP);
+   rw_exit(_node_lock);
 
 memset(ibp->b_data, 0, fs->fs_bsize);
 dp2 = (struct ufs2_dinode *)(ibp->b_data);



PPPoE with PAP and CHAP

2023-03-13 Thread Moritz Buhl
Hi gerhard and tech,

You wrote the following in 2015 and I was wondering if it was worth
upstreaming it:

The PPPoE kernel module gets a new option auto for the authproto
in addition to pap and chap in client mode.

In this case the authentication protocol sent by the server in the
LCP configure request is used.

by gerhard@

Index: sbin/ifconfig/ifconfig.8
===
RCS file: /cvs/src/sbin/ifconfig/ifconfig.8,v
retrieving revision 1.390
diff -u -p -r1.390 ifconfig.8
--- sbin/ifconfig/ifconfig.827 Feb 2023 14:53:38 -  1.390
+++ sbin/ifconfig/ifconfig.810 Mar 2023 19:56:33 -
@@ -1693,6 +1693,7 @@ interface acting as a client.
 The protocol name can be either
 .Ql chap ,
 .Ql pap ,
+.Ql auto ,
 or
 .Ql none .
 In the latter case, authentication will be turned off.
Index: sbin/ifconfig/ifconfig.c
===
RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v
retrieving revision 1.461
diff -u -p -r1.461 ifconfig.c
--- sbin/ifconfig/ifconfig.c18 Jan 2023 21:57:10 -  1.461
+++ sbin/ifconfig/ifconfig.c10 Mar 2023 19:56:33 -
@@ -5481,6 +5481,8 @@ setsroto(const char *val, int d)
spa.proto = PPP_PAP;
else if (strcmp(val, "chap") == 0)
spa.proto = PPP_CHAP;
+   else if (d == 0 && strcmp(val, "auto") == 0)
+   spa.proto = PPP_AUTOAUTH;
else if (strcmp(val, "none") == 0)
spa.proto = 0;
else
@@ -5588,6 +5590,9 @@ sppp_printproto(const char *name, struct
break;
case PPP_CHAP:
printf("chap ");
+   break;
+   case PPP_AUTOAUTH:
+   printf("auto ");
break;
default:
printf("0x%04x ", auth->proto);
Index: sys/net/if_sppp.h
===
RCS file: /cvs/src/sys/net/if_sppp.h,v
retrieving revision 1.30
diff -u -p -r1.30 if_sppp.h
--- sys/net/if_sppp.h   17 Nov 2021 18:00:24 -  1.30
+++ sys/net/if_sppp.h   10 Mar 2023 19:56:33 -
@@ -40,6 +40,8 @@
 #ifndef _NET_IF_SPPP_H_
 #define _NET_IF_SPPP_H_
 
+#define PPP_AUTOAUTH   0xffee  /* automatic auth protocol selection */
+
 #define AUTHFLAG_NOCALLOUT 1 /* don't require authentication on callouts */
 #define AUTHFLAG_NORECHALLENGE 2 /* don't re-challenge CHAP */
 
@@ -185,6 +187,7 @@ struct sppp {
struct sipcp ipv6cp;/* IPV6CP params */
struct sauth myauth;/* auth params, i'm peer */
struct sauth hisauth;   /* auth params, i'm authenticator */
+   u_short peerproto;  /* auth protocol requested by peer */
u_char chap_challenge[AUTHCHALEN]; /* random challenge used by CHAP */
 
/*
Index: sys/net/if_spppsubr.c
===
RCS file: /cvs/src/sys/net/if_spppsubr.c,v
retrieving revision 1.191
diff -u -p -r1.191 if_spppsubr.c
--- sys/net/if_spppsubr.c   2 Jan 2022 22:36:03 -   1.191
+++ sys/net/if_spppsubr.c   10 Mar 2023 19:57:55 -
@@ -1894,11 +1894,16 @@ sppp_lcp_RCR(struct sppp *sp, struct lcp
 
case LCP_OPT_AUTH_PROTO:
authproto = (p[2] << 8) + p[3];
-   if (sp->myauth.proto != authproto) {
+   if (sp->myauth.proto == PPP_AUTOAUTH) {
+   if (debug)
+   addlog("[peer wants %s] ",
+  sppp_proto_name(authproto));
+   sp->peerproto = authproto;
+   } else if (sp->myauth.proto != authproto) {
/* not agreed, nak */
if (debug)
addlog("[mine %s != his %s] ",
-  
sppp_proto_name(sp->hisauth.proto),
+  
sppp_proto_name(sp->myauth.proto),
   sppp_proto_name(authproto));
p[2] = sp->myauth.proto >> 8;
p[3] = sp->myauth.proto;
@@ -3409,7 +3414,8 @@ sppp_chap_input(struct sppp *sp, struct 
}
x = splnet();
sp->pp_flags &= ~PP_NEEDAUTH;
-   if (sp->myauth.proto == PPP_CHAP &&
+   if ((sp->myauth.proto == PPP_CHAP ||
+   sp->myauth.proto == PPP_AUTOAUTH) &&
(sp->lcp.opts & (1 << LCP_OPT_AUTH_PROTO)) &&
(sp->lcp.protos & (1 << IDX_CHAP)) == 0) {
/*
@@ -3558,7 +3564,8 @@ sppp_chap_init(struct sppp *sp)
 void
 sppp_chap_open(struct sppp *sp)
 {
-   if (sp->myauth.proto == PPP_CHAP &&
+   if ((sp->myauth.proto == PPP_CHAP ||
+   

mountd: potential use of uninitialized stack data

2023-02-22 Thread Moritz Buhl
Dear tech@,

codechecker found the following problem with fsb in sbin/mountd:

mntsrv(...)
...
struct statfs fsb;
...
if (realpath(rpcpath, dirpath) == NULL) {
bad = errno;
if (debug)
fprintf(stderr, "realpath failed on %s\n",
rpcpath);
strlcpy(dirpath, rpcpath, sizeof(dirpath));
} else if (stat(dirpath, ) == -1 ||
(!S_ISDIR(stb.st_mode) && !S_ISREG(stb.st_mode)) ||
statfs(dirpath, ) == -1) {
if (debug)
fprintf(stderr, "stat failed on %s\n", dirpath);
bad = ENOENT;   /* We will send error reply later */
}

/* Check in the exports list */
sigprocmask(SIG_BLOCK, _mask, NULL);
ep = ex_search(_fsid);
...

The tool finds a path to ex_search where fsb.f_fsid is uninitialized.

ex_search compares the potentially uninitialized stack data:

ex_search(fsid_t *fsid)
{
struct exportlist *ep;

ep = exphead;
while (ep) {
if (ep->ex_fs.val[0] == fsid->val[0] &&
...

Is it sufficient to zero fsb?
Is this really reachable?

mbuhl



igc(4) remove unnecessary PHY ID checks

2023-01-25 Thread Moritz Buhl
Dear tech,

Looking into various reports on errors with mini routers that use
igc(4), I found the following diff in FreeBSD:
https://github.com/freebsd/freebsd-src/commit/29d7f1ff579579711dd5a3325480728b8ed45f8c

I additionally deleted the igc_set_d0_lplu_state_i225 and
igc_set_d3_lplu_state_i225 functions since they are no longer used.

I am not convinced that this diff really fixes any of the issues,
however I do appreciate feedback.

Ok?

mbuhl

Index: dev/pci/igc_i225.c
===
RCS file: /cvs/src/sys/dev/pci/igc_i225.c,v
retrieving revision 1.3
diff -u -p -r1.3 igc_i225.c
--- dev/pci/igc_i225.c  11 May 2022 06:14:15 -  1.3
+++ dev/pci/igc_i225.c  10 Jan 2023 09:26:04 -
@@ -163,16 +163,7 @@ igc_init_phy_params_i225(struct igc_hw *
goto out;
 
ret_val = igc_get_phy_id(hw);
-   /* Verify phy id and set remaining function pointers */
-   switch (phy->id) {
-   case I225_I_PHY_ID:
-   default:
-   phy->type = igc_phy_i225;
-   phy->ops.set_d0_lplu_state = igc_set_d0_lplu_state_i225;
-   phy->ops.set_d3_lplu_state = igc_set_d3_lplu_state_i225;
-   /* TODO - complete with GPY PHY information */
-   break;
-   }
+   phy->type = igc_phy_i225;
 
 out:
return ret_val;
@@ -1104,66 +1095,6 @@ igc_init_hw_i225(struct igc_hw *hw)
 
ret_val = igc_init_hw_base(hw);
return ret_val;
-}
-
-/*
- * igc_set_d0_lplu_state_i225 - Set Low-Power-Link-Up (LPLU) D0 state
- * @hw: pointer to the HW structure
- * @active: true to enable LPLU, false to disable
- *
- * Note: since I225 does not actually support LPLU, this function
- * simply enables/disables 1G and 2.5G speeds in D0.
- */
-int
-igc_set_d0_lplu_state_i225(struct igc_hw *hw, bool active)
-{
-   uint32_t data;
-
-   DEBUGFUNC("igc_set_d0_lplu_state_i225");
-
-   data = IGC_READ_REG(hw, IGC_I225_PHPM);
-
-   if (active) {
-   data |= IGC_I225_PHPM_DIS_1000;
-   data |= IGC_I225_PHPM_DIS_2500;
-   } else {
-   data &= ~IGC_I225_PHPM_DIS_1000;
-   data &= ~IGC_I225_PHPM_DIS_2500;
-   }
-
-   IGC_WRITE_REG(hw, IGC_I225_PHPM, data);
-   return IGC_SUCCESS;
-}
-
-/*
- * igc_set_d3_lplu_state_i225 - Set Low-Power-Link-Up (LPLU) D3 state
- * @hw: pointer to the HW structure
- * @active: true to enable LPLU, false to disable
- *
- * Note: since I225 does not actually support LPLU, this function
- * simply enables/disables 100M, 1G and 2.5G speeds in D3.
- */
-int
-igc_set_d3_lplu_state_i225(struct igc_hw *hw, bool active)
-{
-   uint32_t data;
-
-   DEBUGFUNC("igc_set_d3_lplu_state_i225");
-
-   data = IGC_READ_REG(hw, IGC_I225_PHPM);
-
-   if (active) {
-   data |= IGC_I225_PHPM_DIS_100_D3;
-   data |= IGC_I225_PHPM_DIS_1000_D3;
-   data |= IGC_I225_PHPM_DIS_2500_D3;
-   } else {
-   data &= ~IGC_I225_PHPM_DIS_100_D3;
-   data &= ~IGC_I225_PHPM_DIS_1000_D3;
-   data &= ~IGC_I225_PHPM_DIS_2500_D3;
-   }
-
-   IGC_WRITE_REG(hw, IGC_I225_PHPM, data);
-   return IGC_SUCCESS;
 }
 
 /**
Index: dev/pci/igc_i225.h
===
RCS file: /cvs/src/sys/dev/pci/igc_i225.h,v
retrieving revision 1.1
diff -u -p -r1.1 igc_i225.h
--- dev/pci/igc_i225.h  31 Oct 2021 14:52:57 -  1.1
+++ dev/pci/igc_i225.h  10 Jan 2023 09:37:40 -
@@ -27,8 +27,6 @@ void  igc_release_swfw_sync_i225(struct i
 intigc_set_ltr_i225(struct igc_hw *, bool);
 intigc_init_hw_i225(struct igc_hw *);
 intigc_setup_copper_link_i225(struct igc_hw *);
-intigc_set_d0_lplu_state_i225(struct igc_hw *, bool);
-intigc_set_d3_lplu_state_i225(struct igc_hw *, bool);
 intigc_set_eee_i225(struct igc_hw *, bool, bool, bool);
 
 #define ID_LED_DEFAULT_I225\
Index: dev/pci/igc_phy.c
===
RCS file: /cvs/src/sys/dev/pci/igc_phy.c,v
retrieving revision 1.2
diff -u -p -r1.2 igc_phy.c
--- dev/pci/igc_phy.c   11 May 2022 06:14:15 -  1.2
+++ dev/pci/igc_phy.c   10 Jan 2023 09:34:11 -
@@ -307,8 +307,7 @@ igc_phy_setup_autoneg(struct igc_hw *hw)
return ret_val;
}
 
-   if ((phy->autoneg_mask & ADVERTISE_2500_FULL) &&
-   hw->phy.id == I225_I_PHY_ID) {
+   if (phy->autoneg_mask & ADVERTISE_2500_FULL) {
/* Read the MULTI GBT AN Control Register - reg 7.32 */
ret_val = phy->ops.read_reg(hw, (STANDARD_AN_REG_MASK <<
MMD_DEVADDR_SHIFT) | ANEG_MULTIGBT_AN_CTRL,
@@ -443,8 +442,7 @@ igc_phy_setup_autoneg(struct igc_hw *hw)
ret_val = phy->ops.write_reg(hw, PHY_1000T_CTRL,
mii_1000t_ctrl_reg);
 
-   if ((phy->autoneg_mask & 

MAP_CONCEAL bypass due to stack overflow

2022-11-11 Thread Moritz Buhl
Dear tech,

currently the MAP_CONCEAL flag for mmap(2) can be bypassed with a
stack overflow. FreeBSD introduced a similar flag called MAP_NOCORE.
They also save the auxinfo pointer in some struct that is attached
to the proc struct, similar to the diff below.

Here is an example CTF challenge:

/*
 * Setup:
 * make mmap
 * echo 'this is a well kept secret' > flag.txt
 * printf 'flag.txt\0' | ./mmap
 * strings mmap.core | grep secret
 */

#include 

#include 
#include 
#include 
#include 

char *
load_file(char *path)
{
char *addr;
int fd;

fd = open(path, O_RDONLY);
if ((addr = mmap((char *)0x4, 0x100, PROT_READ, MAP_FIXED
| MAP_CONCEAL, fd, 0)) == MAP_FAILED)
err(1, NULL);
close(fd);

return (addr);
}

int
main(void)
{
char file[0x100];
char *buf;

puts("Content-type: text/html\n");
puts("simple mmapPOST filename to /cgi-bin/mmap");
read(0, file, 0x1000);
buf = load_file(file);
printf("load: %p\n", buf);
fflush(stdout);

return (0);
}


With the following python program (on amd64) I can generate an input
string that writes the MAP_CONCEALED content of flag.txt to the
core file:

import struct
p64 = lambda x: struct.pack("p_p;
@@ -1282,23 +1279,7 @@ coredump_notes_elf(struct proc *p, void 
/* Second, write an NT_OPENBSD_AUXV note. */
notesize = sizeof(nhdr) + elfround(sizeof("OpenBSD")) +
elfround(ELF_AUX_WORDS * sizeof(char *));
-   if (iocookie) {
-   iov.iov_base = 
-   iov.iov_len = sizeof(pss);
-   uio.uio_iov = 
-   uio.uio_iovcnt = 1;
-   uio.uio_offset = (off_t)pr->ps_strings;
-   uio.uio_resid = sizeof(pss);
-   uio.uio_segflg = UIO_SYSSPACE;
-   uio.uio_rw = UIO_READ;
-   uio.uio_procp = NULL;
-
-   error = uvm_io(>p_vmspace->vm_map, , 0);
-   if (error)
-   return (error);
-
-   if (pss.ps_envstr == NULL)
-   return (EIO);
+   if (iocookie && pr->ps_auxinfo) {
 
nhdr.namesz = sizeof("OpenBSD");
nhdr.descsz = ELF_AUX_WORDS * sizeof(char *);
@@ -1315,7 +1296,7 @@ coredump_notes_elf(struct proc *p, void 
return (error);
 
error = coredump_write(iocookie, UIO_USERSPACE,
-   pss.ps_envstr + pss.ps_nenvstr + 1, nhdr.descsz);
+   (caddr_t)pr->ps_auxinfo, nhdr.descsz);
if (error)
return (error);
}
Index: kern/kern_exec.c
===
RCS file: /mount/openbsd/cvs/src/sys/kern/kern_exec.c,v
retrieving revision 1.238
diff -u -p -r1.238 kern_exec.c
--- kern/kern_exec.c30 Oct 2022 17:43:40 -  1.238
+++ kern/kern_exec.c6 Nov 2022 10:37:19 -
@@ -492,6 +492,8 @@ sys_execve(struct proc *p, void *v, regi
if (!copyargs(, , stack, argp))
goto exec_abort;
 
+   pr->ps_auxinfo = (vaddr_t)pack.ep_auxinfo;
+
/* copy out the process's ps_strings structure */
if (copyout(, (char *)pr->ps_strings, sizeof(arginfo)))
goto exec_abort;
Index: sys/proc.h
===
RCS file: /mount/openbsd/cvs/src/sys/sys/proc.h,v
retrieving revision 1.334
diff -u -p -r1.334 proc.h
--- sys/proc.h  23 Jul 2022 22:10:59 -  1.334
+++ sys/proc.h  6 Nov 2022 10:37:19 -
@@ -215,6 +215,7 @@ struct process {
charps_comm[_MAXCOMLEN];/* command name, incl NUL */
 
vaddr_t ps_strings; /* User pointers to argv/env */
+   vaddr_t ps_auxinfo; /* User pointer to auxinfo */
vaddr_t ps_timekeep;/* User pointer to timekeep */
vaddr_t ps_sigcode; /* [I] User pointer to signal code */
vaddr_t ps_sigcoderet;  /* [I] User ptr to sigreturn retPC */



Re: relayd: uninitialized errstr

2022-11-09 Thread Moritz Buhl
On Thu, Nov 10, 2022 at 01:29:13AM +0100, Theo Buehler wrote:
> On Thu, Nov 10, 2022 at 01:10:51AM +0100, Moritz Buhl wrote:
> > errstr is never set but fail does:
> > RSA_meth_free(rsae_method);
> > fatalx("%s: %s", __func__, errstr);
> > Found by codechecker.
> > 
> > OK?
> 
> Needs more braces

Indeed.
Is this OK?

Index: ca.c
===
RCS file: /cvs/src/usr.sbin/relayd/ca.c,v
retrieving revision 1.39
diff -u -p -r1.39 ca.c
--- ca.c20 Jan 2022 17:56:35 -  1.39
+++ ca.c10 Nov 2022 00:33:35 -
@@ -513,8 +513,10 @@ ca_engine_init(struct relayd *x_env)
if (rsa_default != NULL)
return;
 
-   if ((rsae_method = RSA_meth_new("RSA privsep engine", 0)) == NULL)
+   if ((rsae_method = RSA_meth_new("RSA privsep engine", 0)) == NULL) {
+   errstr = "RSA_meth_new";
goto fail;
+   }
 
RSA_meth_set_pub_enc(rsae_method, rsae_pub_enc);
RSA_meth_set_pub_dec(rsae_method, rsae_pub_dec);



relayd: uninitialized errstr

2022-11-09 Thread Moritz Buhl
errstr is never set but fail does:
RSA_meth_free(rsae_method);
fatalx("%s: %s", __func__, errstr);
Found by codechecker.

OK?
mbuhl

Index: ca.c
===
RCS file: /cvs/src/usr.sbin/relayd/ca.c,v
retrieving revision 1.39
diff -u -p -r1.39 ca.c
--- ca.c20 Jan 2022 17:56:35 -  1.39
+++ ca.c10 Nov 2022 00:06:20 -
@@ -514,6 +514,7 @@ ca_engine_init(struct relayd *x_env)
return;
 
if ((rsae_method = RSA_meth_new("RSA privsep engine", 0)) == NULL)
+   errstr = "RSA_meth_new";
goto fail;
 
RSA_meth_set_pub_enc(rsae_method, rsae_pub_enc);



relayd: always call va_end

2022-11-09 Thread Moritz Buhl
The same code is in httpd but there it was fixed in 

commit 6b535b529336a3fd1beb56c42ff5755b84ba9b03
Author: jung 
Date:   Sun May 22 19:19:21 2016 +

fix unbalanced va_start and va_end macros

from Hiltjo Posthuma

"do." deraadt

Found by codechecker

OK?
mbuhl

Index: usr.sbin/relayd/relayd.c
===
RCS file: /cvs/src/usr.sbin/relayd/relayd.c,v
retrieving revision 1.189
diff -u -p -r1.189 relayd.c
--- usr.sbin/relayd/relayd.c3 Sep 2022 20:07:31 -   1.189
+++ usr.sbin/relayd/relayd.c9 Nov 2022 23:51:57 -
@@ -656,11 +656,13 @@ kv_set(struct kv *kv, char *fmt, ...)
va_list   ap;
char*value = NULL;
struct kv   *ckv;
+   int  ret;
 
va_start(ap, fmt);
-   if (vasprintf(, fmt, ap) == -1)
-   return (-1);
+   ret = vasprintf(, fmt, ap);
va_end(ap);
+   if (ret == -1)
+   return (-1);
 
/* Remove all children */
while ((ckv = TAILQ_FIRST(>kv_children)) != NULL) {
@@ -681,11 +683,13 @@ kv_setkey(struct kv *kv, char *fmt, ...)
 {
va_list  ap;
char*key = NULL;
+   int  ret;
 
va_start(ap, fmt);
-   if (vasprintf(, fmt, ap) == -1)
-   return (-1);
+   ret = vasprintf(, fmt, ap);
va_end(ap);
+   if (ret == -1)
+   return (-1);
 
free(kv->kv_key);
kv->kv_key = key;



acme-client memory leak in error case

2022-11-09 Thread Moritz Buhl
Hi tech,

g is not freed in this error case.
Found by codechecker.

OK?
mbuhl

Index: usr.sbin/acme-client/netproc.c
===
RCS file: /cvs/src/usr.sbin/acme-client/netproc.c,v
retrieving revision 1.31
diff -u -p -r1.31 netproc.c
--- usr.sbin/acme-client/netproc.c  24 Aug 2021 10:07:30 -  1.31
+++ usr.sbin/acme-client/netproc.c  9 Nov 2022 18:11:17 -
@@ -222,6 +222,7 @@ again:
if ((st = http_head_get("Location", g->head, g->headsz)) ==
NULL) {
warnx("redirect without location header");
+   http_get_free(g);
return -1;
}
 



apm garbage stack value could be accessed

2022-11-09 Thread Moritz Buhl
Hi tech,

In case send or recv fail in send_command, reply.error is stack
garbage.  This wouldn't be possible if reply was zeroed.  But
checking for ret == 0 ensures that reply was fully written.

OK?
mbuhl

Index: usr.sbin/apm/apm.c
===
RCS file: /cvs/src/usr.sbin/apm/apm.c,v
retrieving revision 1.42
diff -u -p -r1.42 apm.c
--- usr.sbin/apm/apm.c  10 Sep 2022 10:10:09 -  1.42
+++ usr.sbin/apm/apm.c  9 Nov 2022 17:29:03 -
@@ -99,6 +99,8 @@ do_zzz(int fd, enum apm_action action)
char *msg;
int ret;
 
+   bzero(, sizeof reply);
+
switch (action) {
case NONE:
case SUSPEND:
@@ -119,7 +121,7 @@ do_zzz(int fd, enum apm_action action)
 
printf("%s...\n", msg);
ret = send_command(fd, , );
-   if (reply.error)
+   if (ret == 0 && reply.error)
errx(1, "%s: %s", apm_state(reply.newstate), 
strerror(reply.error));
exit(ret);
 }



libcrypto potential memory leak in OBJ_NAME_add error path

2022-11-08 Thread Moritz Buhl
In case lh_OBJ_NAME_insert returns NULL due to a failed malloc, onp
is leaked in OBJ_NAME_add.

Found by CodeChecker.

OK?
mbuhl

Index: lib/libcrypto/objects/o_names.c
===
RCS file: /cvs/src/lib/libcrypto/objects/o_names.c,v
retrieving revision 1.22
diff -u -p -r1.22 o_names.c
--- lib/libcrypto/objects/o_names.c 29 Jan 2017 17:49:23 -  1.22
+++ lib/libcrypto/objects/o_names.c 8 Nov 2022 12:24:47 -
@@ -196,6 +196,7 @@ OBJ_NAME_add(const char *name, int type,
}
free(ret);
} else {
+   free(onp);
if (lh_OBJ_NAME_error(names_lh)) {
/* ERROR */
return (0);



Re: mbufs growing in 7.2

2022-11-08 Thread Moritz Buhl
Hi Greg, Hi Joe,

dlg@ hinted to me that the ring might overwrite it's own starting
position with the current code.

Does this help?
mbuhl

Index: dev/pci/if_igc.c
===
RCS file: /cvs/src/sys/dev/pci/if_igc.c,v
retrieving revision 1.9
diff -u -p -r1.9 if_igc.c
--- dev/pci/if_igc.c2 Jun 2022 07:41:17 -   1.9
+++ dev/pci/if_igc.c8 Nov 2022 10:35:39 -
@@ -978,7 +978,7 @@ igc_start(struct ifqueue *ifq)
mask = sc->num_tx_desc - 1;
 
for (;;) {
-   if (free <= IGC_MAX_SCATTER) {
+   if (free <= IGC_MAX_SCATTER + 1) {
ifq_set_oactive(ifq);
break;
}
@@ -1005,6 +1005,7 @@ igc_start(struct ifqueue *ifq)
/* Consume the first descriptor */
prod++;
prod &= mask;
+   free--;
}
 
for (i = 0; i < map->dm_nsegs; i++) {



potential memory leak in bgpd rde_dump_ctx_new

2022-11-07 Thread Moritz Buhl
Hi tech,
Dear claudio,

ctx might leak due to a prefix/rib dump new/subtree failing in calloc
and then going to nomem in rde_dump_ctx_new.

I am wondering if a similar fix is missing in rde_dump_done after
the nomem label.

thoughts?
mbuhl

Found by CodeChecker.

Index: usr.sbin/bgpd/rde.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
retrieving revision 1.578
diff -u -p -r1.578 rde.c
--- usr.sbin/bgpd/rde.c 23 Sep 2022 15:49:20 -  1.578
+++ usr.sbin/bgpd/rde.c 7 Nov 2022 19:17:16 -
@@ -2739,6 +2739,7 @@ rde_dump_ctx_new(struct ctl_show_rib_req
error = CTL_RES_NOMEM;
imsg_compose(ibuf_se_ctl, IMSG_CTL_RESULT, 0, pid, -1, ,
sizeof(error));
+   free(ctx);
return;
}
 



bgpd and ldpd pfkey_reply might access uninitialized stack memory

2022-11-07 Thread Moritz Buhl
Hi tech,
Dear claudio,

there could be an uninitialized stack memory access in pfkey_reply.

It looks like this:
struct sadb_msg hdr, *msg;
...

do {
rv = pfkey_read(sd, );
if (rv == -1)
return (-1);
} while (rv);

if (hdr.sadb_msg_errno != 0) {

And in the errno cases of pfkey_read, it doesn't initialize hdr but
then breaks out of the do-while loop.

OK?
mbuhl

Found by CodeChecker.

Index: bgpd/pfkey.c
===
RCS file: /cvs/src/usr.sbin/bgpd/pfkey.c,v
retrieving revision 1.67
diff -u -p -r1.67 pfkey.c
--- bgpd/pfkey.c17 Aug 2022 15:15:26 -  1.67
+++ bgpd/pfkey.c7 Nov 2022 18:58:37 -
@@ -423,7 +423,7 @@ pfkey_read(int sd, struct sadb_msg *h)
 
if (recv(sd, , sizeof(hdr), MSG_PEEK) != sizeof(hdr)) {
if (errno == EAGAIN || errno == EINTR)
-   return (0);
+   return (1);
log_warn("pfkey peek");
return (-1);
}
@@ -439,7 +439,7 @@ pfkey_read(int sd, struct sadb_msg *h)
/* not ours, discard */
if (read(sd, , sizeof(hdr)) == -1) {
if (errno == EAGAIN || errno == EINTR)
-   return (0);
+   return (1);
log_warn("pfkey read");
return (-1);
}
Index: ldpd/pfkey.c
===
RCS file: /cvs/src/usr.sbin/ldpd/pfkey.c,v
retrieving revision 1.12
diff -u -p -r1.12 pfkey.c
--- ldpd/pfkey.c23 Jan 2019 02:02:04 -  1.12
+++ ldpd/pfkey.c7 Nov 2022 19:00:01 -
@@ -253,7 +253,7 @@ pfkey_read(int sd, struct sadb_msg *h)
 
if (recv(sd, , sizeof(hdr), MSG_PEEK) != sizeof(hdr)) {
if (errno == EAGAIN || errno == EINTR)
-   return (0);
+   return (1);
log_warn("pfkey peek");
return (-1);
}
@@ -269,7 +269,7 @@ pfkey_read(int sd, struct sadb_msg *h)
/* not ours, discard */
if (read(sd, , sizeof(hdr)) == -1) {
if (errno == EAGAIN || errno == EINTR)
-   return (0);
+   return (1);
log_warn("pfkey read");
return (-1);
}



Re: em(4) IPv4, TCP, UDP checksum offloading

2022-11-05 Thread Moritz Buhl
Looking for OKs.

On Sat, Oct 15, 2022 at 10:01:53PM +0200, Moritz Buhl wrote:
> With the previous diffs I am seeing sporadic connection problems in tcpbench
> via IPv6 on Intel 82546GB.
> The diff was too big anyways. Here is a smaller diff that introduces
> checksum offloading for the controllers that use the advanced descriptors.
> 
> I tested this diff on i350 and 82571EB, receive, send, forwarding,
> icmp4, icmp6, tcp4, tcp6, udp4, udp6, also over vlan and veb.
> 
> I am still looking for testing. Any feedback?
> 
> mbuhl
> 
> Index: dev/pci/if_em.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_em.c,v
> retrieving revision 1.362
> diff -u -p -r1.362 if_em.c
> --- dev/pci/if_em.c   23 Jun 2022 09:38:28 -  1.362
> +++ dev/pci/if_em.c   15 Oct 2022 19:49:12 -
> @@ -37,6 +37,8 @@ POSSIBILITY OF SUCH DAMAGE.
>  #include 
>  #include 
>  
> +#include 
> +
>  /*
>   *  Driver version
>   */
> @@ -278,6 +280,8 @@ void em_receive_checksum(struct em_softc
>struct mbuf *);
>  u_intem_transmit_checksum_setup(struct em_queue *, struct mbuf *, 
> u_int,
>   u_int32_t *, u_int32_t *);
> +u_intem_tx_ctx_setup(struct em_queue *, struct mbuf *, u_int, 
> u_int32_t *,
> + u_int32_t *);
>  void em_iff(struct em_softc *);
>  void em_update_link_status(struct em_softc *);
>  int  em_get_buf(struct em_queue *, int);
> @@ -1220,10 +1224,9 @@ em_encap(struct em_queue *que, struct mb
>   BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
>   }
>  
> - if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
> - sc->hw.mac_type != em_82576 &&
> - sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
> - sc->hw.mac_type != em_i350) {
> + if (sc->hw.mac_type >= em_82575 && sc->hw.mac_type <= em_i210) {
> + used += em_tx_ctx_setup(que, m, head, _upper, _lower);
> + } else if (sc->hw.mac_type >= em_82543) {
>   used += em_transmit_checksum_setup(que, m, head,
>   _upper, _lower);
>   } else {
> @@ -1278,7 +1281,8 @@ em_encap(struct em_queue *que, struct mb
>  
>  #if NVLAN > 0
>   /* Find out if we are in VLAN mode */
> - if (m->m_flags & M_VLANTAG) {
> + if (m->m_flags & M_VLANTAG && (sc->hw.mac_type < em_82575 ||
> + sc->hw.mac_type > em_i210)) {
>   /* Set the VLAN id */
>   desc->upper.fields.special = htole16(m->m_pkthdr.ether_vtag);
>  
> @@ -1964,17 +1968,16 @@ em_setup_interface(struct em_softc *sc)
>   ifp->if_capabilities = IFCAP_VLAN_MTU;
>  
>  #if NVLAN > 0
> - if (sc->hw.mac_type != em_82575 && sc->hw.mac_type != em_82580 &&
> - sc->hw.mac_type != em_82576 &&
> - sc->hw.mac_type != em_i210 && sc->hw.mac_type != em_i350)
> - ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING;
> + ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING;
>  #endif
>  
> - if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
> - sc->hw.mac_type != em_82576 &&
> - sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
> - sc->hw.mac_type != em_i350)
> + if (sc->hw.mac_type >= em_82543) {
>   ifp->if_capabilities |= IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4;
> + }
> + if (sc->hw.mac_type >= em_82575 && sc->hw.mac_type <= em_i210) {
> + ifp->if_capabilities |= IFCAP_CSUM_IPv4;
> + ifp->if_capabilities |= IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
> + }
>  
>   /* 
>* Specify the media types supported by this adapter and register
> @@ -2389,6 +2392,108 @@ em_free_transmit_structures(struct em_so
>   0, que->tx.sc_tx_dma.dma_map->dm_mapsize,
>   BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
>   }
> +}
> +
> +u_int
> +em_tx_ctx_setup(struct em_queue *que, struct mbuf *mp, u_int head,
> +u_int32_t *olinfo_status, u_int32_t *cmd_type_len)
> +{
> + struct e1000_adv_tx_context_desc *TD;
> + struct ether_header *eh = mtod(mp, struct ether_header *);
> + struct mbuf *m;
> + uint32_t vlan_macip_lens = 0, type_tucmd_mlhl = 0, mss_l4len_idx = 0;
> + int off = 0, 

Re: em(4) IPv4, TCP, UDP checksum offloading

2022-10-15 Thread Moritz Buhl
With the previous diffs I am seeing sporadic connection problems in tcpbench
via IPv6 on Intel 82546GB.
The diff was too big anyways. Here is a smaller diff that introduces
checksum offloading for the controllers that use the advanced descriptors.

I tested this diff on i350 and 82571EB, receive, send, forwarding,
icmp4, icmp6, tcp4, tcp6, udp4, udp6, also over vlan and veb.

I am still looking for testing. Any feedback?

mbuhl

Index: dev/pci/if_em.c
===
RCS file: /cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.362
diff -u -p -r1.362 if_em.c
--- dev/pci/if_em.c 23 Jun 2022 09:38:28 -  1.362
+++ dev/pci/if_em.c 15 Oct 2022 19:49:12 -
@@ -37,6 +37,8 @@ POSSIBILITY OF SUCH DAMAGE.
 #include 
 #include 
 
+#include 
+
 /*
  *  Driver version
  */
@@ -278,6 +280,8 @@ void em_receive_checksum(struct em_softc
 struct mbuf *);
 u_int  em_transmit_checksum_setup(struct em_queue *, struct mbuf *, u_int,
u_int32_t *, u_int32_t *);
+u_int  em_tx_ctx_setup(struct em_queue *, struct mbuf *, u_int, u_int32_t *,
+   u_int32_t *);
 void em_iff(struct em_softc *);
 void em_update_link_status(struct em_softc *);
 int  em_get_buf(struct em_queue *, int);
@@ -1220,10 +1224,9 @@ em_encap(struct em_queue *que, struct mb
BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
}
 
-   if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350) {
+   if (sc->hw.mac_type >= em_82575 && sc->hw.mac_type <= em_i210) {
+   used += em_tx_ctx_setup(que, m, head, _upper, _lower);
+   } else if (sc->hw.mac_type >= em_82543) {
used += em_transmit_checksum_setup(que, m, head,
_upper, _lower);
} else {
@@ -1278,7 +1281,8 @@ em_encap(struct em_queue *que, struct mb
 
 #if NVLAN > 0
/* Find out if we are in VLAN mode */
-   if (m->m_flags & M_VLANTAG) {
+   if (m->m_flags & M_VLANTAG && (sc->hw.mac_type < em_82575 ||
+   sc->hw.mac_type > em_i210)) {
/* Set the VLAN id */
desc->upper.fields.special = htole16(m->m_pkthdr.ether_vtag);
 
@@ -1964,17 +1968,16 @@ em_setup_interface(struct em_softc *sc)
ifp->if_capabilities = IFCAP_VLAN_MTU;
 
 #if NVLAN > 0
-   if (sc->hw.mac_type != em_82575 && sc->hw.mac_type != em_82580 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_i210 && sc->hw.mac_type != em_i350)
-   ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING;
+   ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING;
 #endif
 
-   if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350)
+   if (sc->hw.mac_type >= em_82543) {
ifp->if_capabilities |= IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4;
+   }
+   if (sc->hw.mac_type >= em_82575 && sc->hw.mac_type <= em_i210) {
+   ifp->if_capabilities |= IFCAP_CSUM_IPv4;
+   ifp->if_capabilities |= IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
+   }
 
/* 
 * Specify the media types supported by this adapter and register
@@ -2389,6 +2392,108 @@ em_free_transmit_structures(struct em_so
0, que->tx.sc_tx_dma.dma_map->dm_mapsize,
BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
}
+}
+
+u_int
+em_tx_ctx_setup(struct em_queue *que, struct mbuf *mp, u_int head,
+u_int32_t *olinfo_status, u_int32_t *cmd_type_len)
+{
+   struct e1000_adv_tx_context_desc *TD;
+   struct ether_header *eh = mtod(mp, struct ether_header *);
+   struct mbuf *m;
+   uint32_t vlan_macip_lens = 0, type_tucmd_mlhl = 0, mss_l4len_idx = 0;
+   int off = 0, hoff;
+   uint8_t ipproto, iphlen;
+
+   *olinfo_status = 0;
+   *cmd_type_len = 0;
+   TD = (struct e1000_adv_tx_context_desc *)>tx.sc_tx_desc_ring[head];
+   
+#if NVLAN > 0
+   if (ISSET(mp->m_flags, M_VLANTAG)) {
+   uint16_t vtag = htole16(mp->m_pkthdr.ether_vtag);
+   vlan_macip_lens |= vtag << E1000_ADVTXD_VLAN_SHIFT;
+   *cmd_type_len |= E1000_ADVTXD_DCMD_VLE;
+   off = 1;
+   }
+#endif
+
+   vlan_macip_lens |= (sizeof(*eh) << E1000_ADVTXD_MACLEN_SHIFT);
+   
+   switch (ntohs(eh->ether_type)) {
+   case ETHERTYPE_IP: {
+   struct ip *ip;
+
+   m = m_getptr(mp, sizeof(*eh), );
+   ip = (struct ip *)(mtod(m, caddr_t) + hoff);
+
+   iphlen = ip->ip_hl << 

Re: em(4) IPv4, TCP, UDP checksum offloading

2022-10-12 Thread Moritz Buhl
Another mistake

On Wed, Oct 12, 2022 at 01:45:30PM +0200, Moritz Buhl wrote:
> On Tue, Oct 11, 2022 at 04:16:15PM +0100, Stuart Henderson wrote:
> > On 2022/10/11 15:03, Moritz Buhl wrote:
> > > Here is a new diff for checksum offloading (ipv4, udp, tcp) for em(4).
> > > 
> > > The previous diff didn't implement hardware vlan tagging for >em82578
> > > which should result in variable ethernet header lengths and thus
> > > wrong checksums inserted at wrong places.
> > > 
> > > The diff below addresses this.
> > > I would appreciate further testing reports with different controllers.
> > > 
> > > mbuhl
> > 
> > I tried this on my laptop which has I219-V em (I run it in a trunk
> > with iwm). It breaks tx (packets don't show up on the other side).
> > rx seems ok.
> 
> The following diff will restrict the usage of the advanced 
> descriptors to 82575, 82576, i350 and i210, and fix what the
> last diff broke for i219.
> 
> Index: dev/pci/if_em.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_em.c,v
> retrieving revision 1.362
> diff -u -p -r1.362 if_em.c
> --- dev/pci/if_em.c   23 Jun 2022 09:38:28 -  1.362
> +++ dev/pci/if_em.c   11 Oct 2022 16:05:43 -
> @@ -37,6 +37,8 @@ POSSIBILITY OF SUCH DAMAGE.
>  #include 
>  #include 
>  
> +#include 
> +
>  /*
>   *  Driver version
>   */
> @@ -278,6 +280,8 @@ void em_receive_checksum(struct em_softc
>struct mbuf *);
>  u_intem_transmit_checksum_setup(struct em_queue *, struct mbuf *, 
> u_int,
>   u_int32_t *, u_int32_t *);
> +u_intem_tx_ctx_setup(struct em_queue *, struct mbuf *, u_int, 
> u_int32_t *,
> + u_int32_t *);
>  void em_iff(struct em_softc *);
>  void em_update_link_status(struct em_softc *);
>  int  em_get_buf(struct em_queue *, int);
> @@ -1220,10 +1224,9 @@ em_encap(struct em_queue *que, struct mb
>   BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
>   }
>  
> - if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
> - sc->hw.mac_type != em_82576 &&
> - sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
> - sc->hw.mac_type != em_i350) {
> + if (sc->hw.mac_type >= em_82575 && sc->hw.mac_type <= em_i210) {
> + used += em_tx_ctx_setup(que, m, head, _upper, _lower);
> + } else if (sc->hw.mac_type >= em_82543) {
>   used += em_transmit_checksum_setup(que, m, head,
>   _upper, _lower);
>   } else {
> @@ -1278,7 +1281,7 @@ em_encap(struct em_queue *que, struct mb
>  
>  #if NVLAN > 0
>   /* Find out if we are in VLAN mode */
> - if (m->m_flags & M_VLANTAG) {
> + if (m->m_flags & M_VLANTAG && sc->hw.mac_type < em_82575) {

I missed this spot, it should be
if (m->m_flags & M_VLANTAG && (sc->hw.mac_type < em_82575 ||
sc->hw.mac_type > em_i210)) {




Re: em(4) IPv4, TCP, UDP checksum offloading

2022-10-12 Thread Moritz Buhl
On Tue, Oct 11, 2022 at 04:16:15PM +0100, Stuart Henderson wrote:
> On 2022/10/11 15:03, Moritz Buhl wrote:
> > Here is a new diff for checksum offloading (ipv4, udp, tcp) for em(4).
> > 
> > The previous diff didn't implement hardware vlan tagging for >em82578
> > which should result in variable ethernet header lengths and thus
> > wrong checksums inserted at wrong places.
> > 
> > The diff below addresses this.
> > I would appreciate further testing reports with different controllers.
> > 
> > mbuhl
> 
> I tried this on my laptop which has I219-V em (I run it in a trunk
> with iwm). It breaks tx (packets don't show up on the other side).
> rx seems ok.

The following diff will restrict the usage of the advanced 
descriptors to 82575, 82576, i350 and i210, and fix what the
last diff broke for i219.

Index: dev/pci/if_em.c
===
RCS file: /cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.362
diff -u -p -r1.362 if_em.c
--- dev/pci/if_em.c 23 Jun 2022 09:38:28 -  1.362
+++ dev/pci/if_em.c 11 Oct 2022 16:05:43 -
@@ -37,6 +37,8 @@ POSSIBILITY OF SUCH DAMAGE.
 #include 
 #include 
 
+#include 
+
 /*
  *  Driver version
  */
@@ -278,6 +280,8 @@ void em_receive_checksum(struct em_softc
 struct mbuf *);
 u_int  em_transmit_checksum_setup(struct em_queue *, struct mbuf *, u_int,
u_int32_t *, u_int32_t *);
+u_int  em_tx_ctx_setup(struct em_queue *, struct mbuf *, u_int, u_int32_t *,
+   u_int32_t *);
 void em_iff(struct em_softc *);
 void em_update_link_status(struct em_softc *);
 int  em_get_buf(struct em_queue *, int);
@@ -1220,10 +1224,9 @@ em_encap(struct em_queue *que, struct mb
BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
}
 
-   if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350) {
+   if (sc->hw.mac_type >= em_82575 && sc->hw.mac_type <= em_i210) {
+   used += em_tx_ctx_setup(que, m, head, _upper, _lower);
+   } else if (sc->hw.mac_type >= em_82543) {
used += em_transmit_checksum_setup(que, m, head,
_upper, _lower);
} else {
@@ -1278,7 +1281,7 @@ em_encap(struct em_queue *que, struct mb
 
 #if NVLAN > 0
/* Find out if we are in VLAN mode */
-   if (m->m_flags & M_VLANTAG) {
+   if (m->m_flags & M_VLANTAG && sc->hw.mac_type < em_82575) {
/* Set the VLAN id */
desc->upper.fields.special = htole16(m->m_pkthdr.ether_vtag);
 
@@ -1964,17 +1967,14 @@ em_setup_interface(struct em_softc *sc)
ifp->if_capabilities = IFCAP_VLAN_MTU;
 
 #if NVLAN > 0
-   if (sc->hw.mac_type != em_82575 && sc->hw.mac_type != em_82580 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_i210 && sc->hw.mac_type != em_i350)
-   ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING;
+   ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING;
 #endif
 
-   if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350)
+   if (sc->hw.mac_type >= em_82543) {
+   ifp->if_capabilities |= IFCAP_CSUM_IPv4;
ifp->if_capabilities |= IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4;
+   ifp->if_capabilities |= IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
+   }
 
/* 
 * Specify the media types supported by this adapter and register
@@ -2391,6 +2391,108 @@ em_free_transmit_structures(struct em_so
}
 }
 
+u_int
+em_tx_ctx_setup(struct em_queue *que, struct mbuf *mp, u_int head,
+u_int32_t *olinfo_status, u_int32_t *cmd_type_len)
+{
+   struct e1000_adv_tx_context_desc *TD;
+   struct ether_header *eh = mtod(mp, struct ether_header *);
+   struct mbuf *m;
+   uint32_t vlan_macip_lens = 0, type_tucmd_mlhl = 0, mss_l4len_idx = 0;
+   int off = 0, hoff;
+   uint8_t ipproto, iphlen;
+
+   *olinfo_status = 0;
+   *cmd_type_len = 0;
+   TD = (struct e1000_adv_tx_context_desc *)>tx.sc_tx_desc_ring[head];
+   
+#if NVLAN > 0
+   if (ISSET(mp->m_flags, M_VLANTAG)) {
+   uint16_t vtag = htole16(mp->m_pkthdr.ether_vtag);
+   vlan

em(4) IPv4, TCP, UDP checksum offloading

2022-10-11 Thread Moritz Buhl
Here is a new diff for checksum offloading (ipv4, udp, tcp) for em(4).

The previous diff didn't implement hardware vlan tagging for >em82578
which should result in variable ethernet header lengths and thus
wrong checksums inserted at wrong places.

The diff below addresses this.
I would appreciate further testing reports with different controllers.

mbuhl

Index: dev/pci/if_em.c
===
RCS file: /mount/openbsd/cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.362
diff -u -p -r1.362 if_em.c
--- dev/pci/if_em.c 23 Jun 2022 09:38:28 -  1.362
+++ dev/pci/if_em.c 10 Oct 2022 18:01:19 -
@@ -37,6 +37,8 @@ POSSIBILITY OF SUCH DAMAGE.
 #include 
 #include 
 
+#include 
+
 /*
  *  Driver version
  */
@@ -278,6 +280,8 @@ void em_receive_checksum(struct em_softc
 struct mbuf *);
 u_int  em_transmit_checksum_setup(struct em_queue *, struct mbuf *, u_int,
u_int32_t *, u_int32_t *);
+u_int  em_tx_ctx_setup(struct em_queue *, struct mbuf *, u_int, u_int32_t *,
+   u_int32_t *);
 void em_iff(struct em_softc *);
 void em_update_link_status(struct em_softc *);
 int  em_get_buf(struct em_queue *, int);
@@ -1220,10 +1224,9 @@ em_encap(struct em_queue *que, struct mb
BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
}
 
-   if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350) {
+   if (sc->hw.mac_type >= em_82575) {
+   used += em_tx_ctx_setup(que, m, head, _upper, _lower);
+   } else if (sc->hw.mac_type >= em_82543) {
used += em_transmit_checksum_setup(que, m, head,
_upper, _lower);
} else {
@@ -1278,7 +1281,7 @@ em_encap(struct em_queue *que, struct mb
 
 #if NVLAN > 0
/* Find out if we are in VLAN mode */
-   if (m->m_flags & M_VLANTAG) {
+   if (m->m_flags & M_VLANTAG && sc->hw.mac_type < em_82575) {
/* Set the VLAN id */
desc->upper.fields.special = htole16(m->m_pkthdr.ether_vtag);
 
@@ -1964,17 +1967,14 @@ em_setup_interface(struct em_softc *sc)
ifp->if_capabilities = IFCAP_VLAN_MTU;
 
 #if NVLAN > 0
-   if (sc->hw.mac_type != em_82575 && sc->hw.mac_type != em_82580 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_i210 && sc->hw.mac_type != em_i350)
-   ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING;
+   ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING;
 #endif
 
-   if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350)
+   if (sc->hw.mac_type >= em_82543) {
+   ifp->if_capabilities |= IFCAP_CSUM_IPv4;
ifp->if_capabilities |= IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4;
+   ifp->if_capabilities |= IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
+   }
 
/* 
 * Specify the media types supported by this adapter and register
@@ -2391,6 +2391,108 @@ em_free_transmit_structures(struct em_so
}
 }
 
+u_int
+em_tx_ctx_setup(struct em_queue *que, struct mbuf *mp, u_int head,
+u_int32_t *olinfo_status, u_int32_t *cmd_type_len)
+{
+   struct e1000_adv_tx_context_desc *TD;
+   struct ether_header *eh = mtod(mp, struct ether_header *);
+   struct mbuf *m;
+   uint32_t vlan_macip_lens = 0, type_tucmd_mlhl = 0, mss_l4len_idx = 0;
+   int off = 0, hoff;
+   uint8_t ipproto, iphlen;
+
+   *olinfo_status = 0;
+   *cmd_type_len = 0;
+   TD = (struct e1000_adv_tx_context_desc *)>tx.sc_tx_desc_ring[head];
+   
+#if NVLAN > 0
+   if (ISSET(mp->m_flags, M_VLANTAG)) {
+   uint16_t vtag = htole16(mp->m_pkthdr.ether_vtag);
+   vlan_macip_lens |= vtag << E1000_ADVTXD_VLAN_SHIFT;
+   *cmd_type_len |= E1000_ADVTXD_DCMD_VLE;
+   off = 1;
+   }
+#endif
+
+   vlan_macip_lens |= (sizeof(*eh) << E1000_ADVTXD_MACLEN_SHIFT);
+   
+   switch (ntohs(eh->ether_type)) {
+   case ETHERTYPE_IP: {
+   struct ip *ip;
+
+   m = m_getptr(mp, sizeof(*eh), );
+   ip = (struct ip *)(mtod(m, caddr_t) + hoff);
+
+   iphlen = ip->ip_hl << 2;
+   ipproto = ip->ip_p;
+
+   type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_IPV4;
+   if (ISSET(mp->m_pkthdr.csum_flags, M_IPV4_CSUM_OUT)) {
+   *olinfo_status |= E1000_TXD_POPTS_IXSM << 8;
+   off = 1;
+   }
+
+   break;
+   }
+#ifdef INET6
+   

Re: semctl, semget and semop possible panics

2022-09-22 Thread Moritz Buhl
On Fri, Sep 16, 2022 at 11:38:49PM +0200, Moritz Buhl wrote:
> Syzkaller found a panic with sysv semaphores:
> https://syzkaller.appspot.com/bug?id=f7e8e2822779918d7a23d9ff9d7c0a3779c00a46

The last diff would introduced two memory leaks, seminfo.semopm is
the only seminfo value that can shrink via sysctl, so we have to
recheck it.  semtot also can change during a sleep.
These problems are now addressed below.

I am still uncertain if anything relevant could change during a
sleep in copyout for GETALL.

The diff is growing in size, would it make sense to look at semctl
separately from semop and semget?

Except for the fuzzer, I am not aware of any code that makes extensive
use of the related syscalls.  But they are cute.

mbuhl

Index: kern/sysv_sem.c
===
RCS file: /cvs/src/sys/kern/sysv_sem.c,v
retrieving revision 1.62
diff -u -p -r1.62 sysv_sem.c
--- kern/sysv_sem.c 16 Sep 2022 15:57:23 -  1.62
+++ kern/sysv_sem.c 17 Sep 2022 15:12:32 -
@@ -236,7 +236,7 @@ sys___semctl(struct proc *p, void *v, re
union semun arg, *uarg = SCARG(uap, arg);
struct semid_ds sbuf;
struct semid_ds *semaptr;
-   unsigned short *semval = NULL;
+   unsigned short *semval = NULL, nsems;
int i, ix, error;
 
switch (cmd) {
@@ -248,6 +248,9 @@ sys___semctl(struct proc *p, void *v, re
if ((error = copyin(uarg, , sizeof(union semun
return (error);
}
+   if (cmd == IPC_SET)
+   if ((error = copyin(arg.buf, , sizeof(sbuf
+   return (error);
 
DPRINTF(("call to semctl(%d, %d, %d, %p)\n", semid, semnum, cmd, uarg));
 
@@ -255,6 +258,7 @@ sys___semctl(struct proc *p, void *v, re
if (ix < 0 || ix >= seminfo.semmni)
return (EINVAL);
 
+again:
if ((semaptr = sema[ix]) == NULL ||
semaptr->sem_perm.seq != IPCID_TO_SEQ(semid))
return (EINVAL);
@@ -277,8 +281,6 @@ sys___semctl(struct proc *p, void *v, re
case IPC_SET:
if ((error = ipcperm(cred, >sem_perm, IPC_M)))
return (error);
-   if ((error = copyin(arg.buf, , sizeof(sbuf))) != 0)
-   return (error);
semaptr->sem_perm.uid = sbuf.sem_perm.uid;
semaptr->sem_perm.gid = sbuf.sem_perm.gid;
semaptr->sem_perm.mode = (semaptr->sem_perm.mode & ~0777) |
@@ -319,11 +321,22 @@ sys___semctl(struct proc *p, void *v, re
break;
 
case GETALL:
+   nsems = semaptr->sem_nsems;
+   semval = mallocarray(nsems, sizeof(arg.array[0]),
+   M_TEMP, M_WAITOK);
+   if (semaptr != sema[ix] || 
+   semaptr->sem_perm.seq != IPCID_TO_SEQ(semid) ||
+   semaptr->sem_nsems != nsems) {
+   free(semval, M_TEMP, nsems * sizeof(arg.array[0]));
+   goto again;
+   }
if ((error = ipcperm(cred, >sem_perm, IPC_R)))
-   return (error);
-   for (i = 0; i < semaptr->sem_nsems; i++) {
-   error = copyout(>sem_base[i].semval,
-   [i], sizeof(arg.array[0]));
+   goto error;
+   for (i = 0; i < nsems; i++)
+   semval[i] = semaptr->sem_base[i].semval;
+   for (i = 0; i < nsems; i++) {
+   error = copyout([i], [i],
+   sizeof(arg.array[0]));
if (error != 0)
break;
}
@@ -350,11 +363,18 @@ sys___semctl(struct proc *p, void *v, re
break;
 
case SETALL:
-   if ((error = ipcperm(cred, >sem_perm, IPC_W)))
-   return (error);
+   nsems = semaptr->sem_nsems;
semval = mallocarray(semaptr->sem_nsems, sizeof(arg.array[0]),
M_TEMP, M_WAITOK);
-   for (i = 0; i < semaptr->sem_nsems; i++) {
+   if (semaptr != sema[ix] ||
+   semaptr->sem_perm.seq != IPCID_TO_SEQ(semid) ||
+   semaptr->sem_nsems != nsems) {
+   free(semval, M_TEMP, nsems * sizeof(arg.array[0]));
+   goto again;
+   }
+   if ((error = ipcperm(cred, >sem_perm, IPC_W)))
+   goto error;
+   for (i = 0; i < nsems; i++) {
error = copyin([i], [i],
sizeof(arg.array[0]));
if (error != 0)
@@ -364,7 +384,15 @@ sys___semctl(struct proc *p, void *v, re
goto error;
}

semctl, semget and semop possible panics

2022-09-16 Thread Moritz Buhl
Syzkaller found a panic with sysv semaphores:
https://syzkaller.appspot.com/bug?id=f7e8e2822779918d7a23d9ff9d7c0a3779c00a46

The problem is that the code uses a few globals (sema, semtot,
seminfo) that can change during sleeping points where the kernel
lock is released.
Usually copyin, copyout, malloc and pool_get (with WAIT_OK).
It is especially bad with sema, whihc is a list of semaphore pointers.

The crash syzkaller found is likely this:
Another process removes the current semaphore while the initial
process is sleeping in one of the modification commands. Then the
process wakes up and writes to the unassigned pool region and the
next pool_get call will notice this and panic.

To fix this, I move the sleeping points up and move the icpperm
checks after the sleeping points.  For the theoretical case, that
the sleep of the modifying command takes longer than removing and
reallocationg the semaphore, the permissions must be rechecked. And
if the number of semaphores changed, we better start over too.

To confirm that the attached diff fixes the crash, instructions on
running syzkaller reproducers follow.

pkg_add gmake go git
git clone https://github.com/google/syzkaller/
cd syzkaller && gmake
syzkaller_home=$PWD
cd /tmp && ln -s $syzkaller_home/bin/openbsd_amd64/syz-* .
mount -u -o ro /

./syz-execprog -repeat 1000 -procs 4 repro.syz

where repro.syz is
ftp -o repro.syz \
 'https://syzkaller.appspot.com/text?tag=ReproSyz=175f4ff208'

kind regards,
mbuhl

Index: kern/sysv_sem.c
===
RCS file: /cvs/src/sys/kern/sysv_sem.c,v
retrieving revision 1.62
diff -u -p -r1.62 sysv_sem.c
--- kern/sysv_sem.c 16 Sep 2022 15:57:23 -  1.62
+++ kern/sysv_sem.c 16 Sep 2022 21:29:36 -
@@ -236,7 +236,7 @@ sys___semctl(struct proc *p, void *v, re
union semun arg, *uarg = SCARG(uap, arg);
struct semid_ds sbuf;
struct semid_ds *semaptr;
-   unsigned short *semval = NULL;
+   unsigned short *semval = NULL, nsems;
int i, ix, error;
 
switch (cmd) {
@@ -248,6 +248,9 @@ sys___semctl(struct proc *p, void *v, re
if ((error = copyin(uarg, , sizeof(union semun
return (error);
}
+   if (cmd == IPC_SET)
+   if ((error = copyin(arg.buf, , sizeof(sbuf
+   return (error);
 
DPRINTF(("call to semctl(%d, %d, %d, %p)\n", semid, semnum, cmd, uarg));
 
@@ -255,6 +258,7 @@ sys___semctl(struct proc *p, void *v, re
if (ix < 0 || ix >= seminfo.semmni)
return (EINVAL);
 
+again:
if ((semaptr = sema[ix]) == NULL ||
semaptr->sem_perm.seq != IPCID_TO_SEQ(semid))
return (EINVAL);
@@ -277,8 +281,6 @@ sys___semctl(struct proc *p, void *v, re
case IPC_SET:
if ((error = ipcperm(cred, >sem_perm, IPC_M)))
return (error);
-   if ((error = copyin(arg.buf, , sizeof(sbuf))) != 0)
-   return (error);
semaptr->sem_perm.uid = sbuf.sem_perm.uid;
semaptr->sem_perm.gid = sbuf.sem_perm.gid;
semaptr->sem_perm.mode = (semaptr->sem_perm.mode & ~0777) |
@@ -319,11 +321,20 @@ sys___semctl(struct proc *p, void *v, re
break;
 
case GETALL:
+   nsems = semaptr->sem_nsems;
+   semval = mallocarray(nsems, sizeof(arg.array[0]),
+   M_TEMP, M_WAITOK);
+   if (semaptr != sema[ix] || semaptr->sem_nsems != nsems) {
+   free(semval, M_TEMP, nsems * sizeof(arg.array[0]));
+   goto again;
+   }
if ((error = ipcperm(cred, >sem_perm, IPC_R)))
-   return (error);
-   for (i = 0; i < semaptr->sem_nsems; i++) {
-   error = copyout(>sem_base[i].semval,
-   [i], sizeof(arg.array[0]));
+   goto error;
+   for (i = 0; i < nsems; i++)
+   semval[i] = semaptr->sem_base[i].semval;
+   for (i = 0; i < nsems; i++) {
+   error = copyout([i], [i],
+   sizeof(arg.array[0]));
if (error != 0)
break;
}
@@ -350,11 +361,16 @@ sys___semctl(struct proc *p, void *v, re
break;
 
case SETALL:
-   if ((error = ipcperm(cred, >sem_perm, IPC_W)))
-   return (error);
+   nsems = semaptr->sem_nsems;
semval = mallocarray(semaptr->sem_nsems, sizeof(arg.array[0]),
M_TEMP, M_WAITOK);
-   for (i = 0; i < semaptr->sem_nsems; i++) {
+   if (semaptr != sema[ix] || semaptr->sem_nsems != nsems) {
+   free(semval, M_TEMP, nsems * sizeof(arg.array[0]));
+   

Re: add sendmmsg and recvmmsg systemcalls

2022-09-08 Thread Moritz Buhl
On Wed, Aug 31, 2022 at 05:44:31PM -0900, Philip Guenther wrote:
> kdump.c will need at least a SYS_recvmmsg line in the big table, and if you
> do a ktrmmsghdr() bit in the kernel a matching decoder will be needed in
> kdump.

Here is a new diff for kdump.
OK?

mbuhl

Index: usr.bin/kdump/kdump.c
===
RCS file: /cvs/src/usr.bin/kdump/kdump.c,v
retrieving revision 1.149
diff -u -p -r1.149 kdump.c
--- usr.bin/kdump/kdump.c   20 Jul 2022 05:56:36 -  1.149
+++ usr.bin/kdump/kdump.c   7 Sep 2022 11:14:19 -
@@ -720,6 +720,8 @@ static const formatter scargs[][8] = {
 [SYS_ptrace]   = { Ptracedecode, Ppid_t, Pptr, Pdecint },
 [SYS_recvmsg]  = { Pfd, Pptr, Sendrecvflagsname },
 [SYS_sendmsg]  = { Pfd, Pptr, Sendrecvflagsname },
+[SYS_recvmmsg] = { Pfd, Pptr, Pucount, Sendrecvflagsname, Pptr },
+[SYS_sendmmsg] = { Pfd, Pptr, Pucount, Sendrecvflagsname },
 [SYS_recvfrom] = { Pfd, Pptr, Pbigsize, Sendrecvflagsname },
 [SYS_accept]   = { Pfd, Pptr, Pptr },
 [SYS_getpeername]  = { Pfd, Pptr, Pptr },
Index: usr.bin/kdump/ktrstruct.c
===
RCS file: /cvs/src/usr.bin/kdump/ktrstruct.c,v
retrieving revision 1.29
diff -u -p -r1.29 ktrstruct.c
--- usr.bin/kdump/ktrstruct.c   21 Dec 2020 07:47:37 -  1.29
+++ usr.bin/kdump/ktrstruct.c   7 Sep 2022 10:00:23 -
@@ -398,6 +398,18 @@ ktrquota(const struct dqblk *quota)
 }
 
 static void
+ktrmmsghdr(const struct mmsghdr *mmsg)
+{
+   printf("struct mmsghdr { msg_hdr = { name=%p, namelen=%u, "
+   "iov=%p, iovlen=%u, control=%p, controllen=%u, flags=",
+   mmsg->msg_hdr.msg_name, mmsg->msg_hdr.msg_namelen,
+   mmsg->msg_hdr.msg_iov, mmsg->msg_hdr.msg_iovlen,
+   mmsg->msg_hdr.msg_control, mmsg->msg_hdr.msg_controllen);
+   sendrecvflagsname(mmsg->msg_hdr.msg_flags);
+   printf(" }, msg_len = %u }\n", mmsg->msg_len);
+}
+
+static void
 ktrmsghdr(const struct msghdr *msg)
 {
printf("struct msghdr { name=%p, namelen=%u, iov=%p, iovlen=%u,"
@@ -649,6 +661,13 @@ ktrstruct(char *buf, size_t buflen)
goto invalid;
memcpy(, data, datalen);
ktrmsghdr();
+   } else if (strcmp(name, "mmsghdr") == 0) {
+   struct mmsghdr mmsg;
+
+   if (datalen != sizeof(mmsg))
+   goto invalid;
+   memcpy(, data, datalen);
+   ktrmmsghdr();
} else if (strcmp(name, "iovec") == 0) {
if (datalen % sizeof(struct iovec))
goto invalid;



Re: add sendmmsg and recvmmsg systemcalls

2022-09-06 Thread Moritz Buhl
On Tue, Sep 06, 2022 at 04:00:39PM +0200, Moritz Buhl wrote:
> Hi,
> here is the most recent diff for the libc part of send and recvmmsg.
> This requires a libc minor bump and therefore should be coordinated
> after snapshots are building normally again.
> 
> To my understanding the minor bump itself should not cause problems
> in ports anymore.

miod reminded me to also bump librthread as stated in libc/shlib_version.

Index: lib/libc/Symbols.list
===
RCS file: /cvs/src/lib/libc/Symbols.list,v
retrieving revision 1.75
diff -u -p -r1.75 Symbols.list
--- lib/libc/Symbols.list   2 Aug 2022 16:45:00 -   1.75
+++ lib/libc/Symbols.list   6 Sep 2022 09:36:40 -
@@ -175,6 +175,7 @@ _thread_sys_readlinkat
 _thread_sys_readv
 _thread_sys_reboot
 _thread_sys_recvfrom
+_thread_sys_recvmmsg
 _thread_sys_recvmsg
 _thread_sys_rename
 _thread_sys_renameat
@@ -184,6 +185,7 @@ _thread_sys_sched_yield
 _thread_sys_select
 _thread_sys_semget
 _thread_sys_semop
+_thread_sys_sendmmsg
 _thread_sys_sendmsg
 _thread_sys_sendsyslog
 _thread_sys_sendto
@@ -372,6 +374,7 @@ readlinkat
 readv
 reboot
 recvfrom
+recvmmsg
 recvmsg
 rename
 renameat
@@ -383,6 +386,7 @@ select
 semctl
 semget
 semop
+sendmmsg
 sendmsg
 sendsyslog
 sendto
Index: lib/libc/shlib_version
===
RCS file: /cvs/src/lib/libc/shlib_version,v
retrieving revision 1.210
diff -u -p -r1.210 shlib_version
--- lib/libc/shlib_version  2 Jun 2021 07:29:03 -   1.210
+++ lib/libc/shlib_version  6 Sep 2022 13:42:09 -
@@ -1,4 +1,4 @@
 major=96
-minor=1
+minor=2
 # note: If changes were made to include/thread_private.h or if system calls
 # were added/changed then librthread/shlib_version must also be updated.
Index: lib/libc/hidden/sys/socket.h
===
RCS file: /cvs/src/lib/libc/hidden/sys/socket.h,v
retrieving revision 1.4
diff -u -p -r1.4 socket.h
--- lib/libc/hidden/sys/socket.h7 May 2016 19:05:22 -   1.4
+++ lib/libc/hidden/sys/socket.h6 Sep 2022 13:41:53 -
@@ -33,9 +33,11 @@ PROTO_NORMAL(listen);
 PROTO_NORMAL(recv);
 PROTO_CANCEL(recvfrom);
 PROTO_CANCEL(recvmsg);
+PROTO_CANCEL(recvmmsg);
 PROTO_NORMAL(send);
-PROTO_CANCEL(sendmsg);
 PROTO_CANCEL(sendto);
+PROTO_CANCEL(sendmsg);
+PROTO_CANCEL(sendmmsg);
 PROTO_NORMAL(setrtable);
 PROTO_NORMAL(setsockopt);
 PROTO_NORMAL(shutdown);
Index: lib/libc/sys/Makefile.inc
===
RCS file: /cvs/src/lib/libc/sys/Makefile.inc,v
retrieving revision 1.163
diff -u -p -r1.163 Makefile.inc
--- lib/libc/sys/Makefile.inc   17 Jul 2022 03:04:27 -  1.163
+++ lib/libc/sys/Makefile.inc   6 Sep 2022 13:41:53 -
@@ -34,8 +34,8 @@ CANCEL=   accept accept4 \
nanosleep \
open openat \
poll ppoll pread preadv pselect pwrite pwritev \
-   read readv recvfrom recvmsg \
-   select sendmsg sendto \
+   read readv recvfrom recvmsg recvmmsg \
+   select sendto sendmsg sendmmsg \
wait4 write writev
 SRCS+= ${CANCEL:%=w_%.c}
 
Index: lib/libc/sys/recv.2
===
RCS file: /cvs/src/lib/libc/sys/recv.2,v
retrieving revision 1.48
diff -u -p -r1.48 recv.2
--- lib/libc/sys/recv.2 21 Nov 2021 23:44:55 -  1.48
+++ lib/libc/sys/recv.2 6 Sep 2022 13:42:12 -
@@ -46,15 +46,35 @@
 .Fn recvfrom "int s" "void *buf" "size_t len" "int flags" "struct sockaddr 
*from" "socklen_t *fromlen"
 .Ft ssize_t
 .Fn recvmsg "int s" "struct msghdr *msg" "int flags"
+.Ft int
+.Fn recvmmsg "int s" "struct mmsghdr *mmsg" "unsigned int vlen" "int flags" 
"struct timespec *timeout"
 .Sh DESCRIPTION
-.Fn recvfrom
+.Fn recv ,
+.Fn recvfrom ,
+.Fn recvmsg ,
 and
-.Fn recvmsg
+.Fn recvmmsg
 are used to receive messages from a socket,
-.Fa s ,
-and may be used to receive
+.Fa s .
+.Fn recv
+is normally used only on a
+.Em connected
+socket (see
+.Xr connect 2 ).
+.Fn recvfrom ,
+.Fn recvmsg ,
+and
+.Fn recvmmsg
+may be used to receive
 data on a socket whether or not it is connection-oriented.
 .Pp
+.Fn recv
+is identical to
+.Fn recvfrom
+with a null
+.Fa from
+parameter.
+.Pp
 If
 .Fa from
 is non-null and the socket is not connection-oriented,
@@ -66,25 +86,6 @@ the buffer associated with
 and modified on return to indicate the actual size of the
 address stored there.
 .Pp
-The
-.Fn recv
-call is normally used only on a
-.Em connected
-socket (see
-.Xr connect 2 )
-and is identical to
-.Fn recvfrom
-with a null
-.Fa from
-parameter.
-.Pp
-On successful completion, all three routines return the number of
-message bytes read.
-If a message is too long to fit in the supplied
-buffer, ex

Re: add sendmmsg and recvmmsg systemcalls

2022-09-06 Thread Moritz Buhl
 receive call waits for a message to arrive, unless
 the socket is nonblocking (see
@@ -158,6 +159,8 @@ The
 .Dv MSG_CMSG_CLOEXEC
 requests that any file descriptors received as ancillary data with
 .Fn recvmsg
+and
+.Fn recvmmsg
 (see below)
 have their close-on-exec flag set.
 .Pp
@@ -249,13 +252,67 @@ Indicates that the packet was received a
 .It Dv MSG_MCAST
 Indicates that the packet was received as multicast.
 .El
+.Pp
+The
+.Fn recvmmsg
+call uses an array of the
+.Fa mmsghdr
+structure of length
+.Fa vlen
+to group multiple
+.Fa msghdr
+structures into a single system call.
+.Fa vlen
+is capped at maximum
+.Dv 1024
+messages that are received in a single call.
+The
+.Fa flags
+field allows setting
+.Dv MSG_WAITFORONE
+to wait for one
+.Fa msghdr ,
+and set
+.Dv MSG_DONTWAIT
+for all subsequent messages.
+A provided
+.Fa timeout
+limits the time spent in the function but it does not limit the
+time spent in lower parts of the kernel.
+.Pp
+The
+.Fa mmsghdr
+structure has the following form, as defined in
+.In sys/socket.h :
+.Bd -literal
+struct mmsghdr {
+   struct msghdr msg_hdr;
+   unsigned int msg_len;
+};
+.Ed
+.Pp
+Here
+.Fa msg_len
+indicated the number of bytes received for each
+.Fa msg_hdr
+member.
 .Sh RETURN VALUES
-These calls return the number of bytes received, or \-1 if an error occurred.
-.Sh ERRORS
+The
 .Fn recv ,
 .Fn recvfrom ,
 and
 .Fn recvmsg
+calls return the number of bytes received, or \-1 if an error occurred.
+The
+.Fn recvmmsg
+call returns the number of messages received, or \-1
+if an error occurred before the first message has been received. 
+.Sh ERRORS
+.Fn recv ,
+.Fn recvfrom ,
+.Fn recvmsg ,
+and
+.Fn recvmmsg
 fail if:
 .Bl -tag -width "[EHOSTUNREACH]"
 .It Bq Er EBADF
@@ -310,6 +367,8 @@ was larger than
 .Pp
 And
 .Fn recvmsg
+and
+.Fn recvmmsg
 may return one of the following errors:
 .Bl -tag -width Er
 .It Bq Er EINVAL
@@ -364,6 +423,10 @@ The
 .Fn recv
 function call appeared in
 .Bx 4.1c .
+The
+.Fn recvmmsg
+syscall first appeared in Linux 2.6.33 and was added to
+.Ox 7.2 .
 .Sh CAVEATS
 Calling
 .Fn recvmsg
Index: lib/libc/sys/send.2
===
RCS file: /cvs/src/lib/libc/sys/send.2,v
retrieving revision 1.34
diff -u -p -r1.34 send.2
--- lib/libc/sys/send.2 11 Jan 2019 06:10:13 -  1.34
+++ lib/libc/sys/send.2 5 Sep 2022 14:59:10 -
@@ -36,7 +36,8 @@
 .Sh NAME
 .Nm send ,
 .Nm sendto ,
-.Nm sendmsg
+.Nm sendmsg ,
+.Nm sendmmsg
 .Nd send a message from a socket
 .Sh SYNOPSIS
 .In sys/socket.h
@@ -46,19 +47,23 @@
 .Fn sendto "int s" "const void *msg" "size_t len" "int flags" "const struct 
sockaddr *to" "socklen_t tolen"
 .Ft ssize_t
 .Fn sendmsg "int s" "const struct msghdr *msg" "int flags"
+.Ft int
+.Fn sendmmsg "int s" "const struct mmsghdr *mmsg" "unsigned int vlen" "int 
flags"
 .Sh DESCRIPTION
 .Fn send ,
 .Fn sendto ,
+.Fn sendmsg ,
 and
-.Fn sendmsg
+.Fn sendmmsg
 are used to transmit a message to another socket.
 .Fn send
 may be used only when the socket is in a
 .Em connected
 state, while
-.Fn sendto
+.Fn sendto ,
+.Fn sendmsg ,
 and
-.Fn sendmsg
+.Fn sendmmsg
 may be used at any time.
 .Pp
 The address of the target is given by
@@ -127,10 +132,21 @@ See
 .Xr recv 2
 for a description of the
 .Fa msghdr
-structure.
+and
+.Fa mmsghdr
+structures.
 .Sh RETURN VALUES
-The call returns the number of characters sent, or \-1
+The
+.Fn send ,
+.Fn sendto ,
+and
+.Fn sendmsg
+calls return the number of characters sent, or \-1
 if an error occurred.
+The
+.Fn sendmmsg
+call returns the number of messages sent, or \-1
+if an error occurred before the first message has been sent. 
 .Sh ERRORS
 .Fn send ,
 .Fn sendto ,
@@ -267,3 +283,7 @@ The
 .Fn send
 function call appeared in
 .Bx 4.1c .
+The
+.Fn sendmmsg
+syscall first appeared in Linux 3.0 and was added to
+.Ox 7.2 .
Index: lib/libc/sys/w_recvmmsg.c
=======
RCS file: lib/libc/sys/w_recvmmsg.c
diff -N lib/libc/sys/w_recvmmsg.c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ lib/libc/sys/w_recvmmsg.c   6 Sep 2022 09:42:08 -
@@ -0,0 +1,32 @@
+/* $OpenBSD$ */
+/*
+ * Copyright (c) 2022 Moritz Buhl 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF

Re: add sendmmsg and recvmmsg systemcalls

2022-09-02 Thread Moritz Buhl
Here is an updated version of the kernel part for sendmmsg.

mbuhl

Index: sys/kern/syscalls.master
===
RCS file: /mount/openbsd/cvs/src/sys/kern/syscalls.master,v
retrieving revision 1.230
diff -u -p -r1.230 syscalls.master
--- sys/kern/syscalls.master2 Sep 2022 13:18:06 -   1.230
+++ sys/kern/syscalls.master2 Sep 2022 20:34:15 -
@@ -247,7 +247,9 @@
 116STD NOLOCK  { int sys_recvmmsg(int s, struct mmsghdr *mmsg, \
unsigned int vlen, unsigned int flags, \
struct timespec *timeout); }
-117UNIMPL  sendmmsg
+117STD NOLOCK  { int sys_sendmmsg(int s,  \
+   struct mmsghdr *mmsg, unsigned int vlen, \
+   unsigned int flags); }
 118STD { int sys_getsockopt(int s, int level, int name, \
void *val, socklen_t *avalsize); }
 119STD { int sys_thrkill(pid_t tid, int signum, void *tcb); }
Index: sys/kern/uipc_syscalls.c
===
RCS file: /mount/openbsd/cvs/src/sys/kern/uipc_syscalls.c,v
retrieving revision 1.202
diff -u -p -r1.202 uipc_syscalls.c
--- sys/kern/uipc_syscalls.c2 Sep 2022 13:18:06 -   1.202
+++ sys/kern/uipc_syscalls.c2 Sep 2022 23:26:08 -
@@ -606,6 +606,92 @@ done:
 }
 
 int
+sys_sendmmsg(struct proc *p, void *v, register_t *retval)
+{
+   struct sys_sendmmsg_args /* {
+   syscallarg(int) s;
+   syscallarg(struct mmsghdr *)mmsg;
+   syscallarg(unsigned int)vlen;
+   syscallarg(unsigned int)flags;
+   } */ *uap = v;
+   struct mmsghdr mmsg, *mmsgp;
+   struct iovec aiov[UIO_SMALLIOV], *iov = aiov, *uiov;
+   size_t iovlen = UIO_SMALLIOV;
+   register_t retsnd;
+   unsigned int vlen, dgrams;
+   int error = 0;
+
+   /* Arbitrarily capped at 1024 datagrams. */
+   vlen = SCARG(uap, vlen);
+   if (vlen > 1024)
+   vlen = 1024;
+
+   mmsgp = SCARG(uap, mmsg);
+   for (dgrams = 0; dgrams < vlen; dgrams++) {
+   error = copyin([dgrams], , sizeof(mmsg));
+   if (error)
+   break;
+
+#ifdef KTRACE
+   if (KTRPOINT(p, KTR_STRUCT))
+   ktrmmsghdr(p, );
+#endif
+
+   if (mmsg.msg_hdr.msg_iovlen > IOV_MAX) {
+   error = EMSGSIZE;
+   break;
+   }
+
+   if (mmsg.msg_hdr.msg_iovlen > iovlen) {
+   if (iov != aiov)
+   free(iov, M_IOV, iovlen *
+   sizeof(struct iovec));
+
+   iovlen = mmsg.msg_hdr.msg_iovlen;
+   iov = mallocarray(iovlen, sizeof(struct iovec),
+   M_IOV, M_WAITOK);
+   }
+
+   if (mmsg.msg_hdr.msg_iovlen > 0) {
+   error = copyin(mmsg.msg_hdr.msg_iov, iov,
+   mmsg.msg_hdr.msg_iovlen * sizeof(struct iovec));
+   if (error)
+   break;
+   }
+
+#ifdef KTRACE
+   if (mmsg.msg_hdr.msg_iovlen && KTRPOINT(p, KTR_STRUCT))
+   ktriovec(p, iov, mmsg.msg_hdr.msg_iovlen);
+#endif
+
+   uiov = mmsg.msg_hdr.msg_iov;
+   mmsg.msg_hdr.msg_iov = iov;
+   mmsg.msg_hdr.msg_flags = 0;
+
+   error = sendit(p, SCARG(uap, s), _hdr,
+   SCARG(uap, flags), );
+   if (error)
+   break;
+
+   mmsg.msg_hdr.msg_iov = uiov;
+   mmsg.msg_len = retsnd;
+
+   error = copyout(, [dgrams], sizeof(mmsg));
+   if (error)
+   break;
+   }
+
+   if (iov != aiov)
+   free(iov, M_IOV, sizeof(struct iovec) * iovlen);
+
+   *retval = dgrams;
+
+   if (dgrams)
+   return 0;
+   return error;
+}
+
+int
 sendit(struct proc *p, int s, struct msghdr *mp, int flags, register_t 
*retsize)
 {
struct file *fp;
Index: sys/sys/socket.h
===
RCS file: /mount/openbsd/cvs/src/sys/sys/socket.h,v
retrieving revision 1.103
diff -u -p -r1.103 socket.h
--- sys/sys/socket.h2 Sep 2022 13:18:07 -   1.103
+++ sys/sys/socket.h2 Sep 2022 22:31:09 -
@@ -579,6 +579,7 @@ ssize_t send(int, const void *, size_t, 
 ssize_tsendto(int, const void *,
size_t, int, const struct sockaddr *, socklen_t);
 ssize_tsendmsg(int, const struct msghdr *, int);
+intsendmmsg(int, struct mmsghdr *, unsigned int, unsigned int);
 intsetsockopt(int, int, int, const void *, socklen_t);
 intshutdown(int, int);
 int

Re: add sendmmsg and recvmmsg systemcalls

2022-09-01 Thread Moritz Buhl
I addressed your concerns as well as these of jca, just the kernel
part (and the new ktrace stuff) below.

One minor thing: I didn't see any kdump output where one struct was
contained in another one but I am printing it like ddb would so I
guess it should be fine.


Index: kern/syscalls.master
===
RCS file: /mount/openbsd/cvs/src/sys/kern/syscalls.master,v
retrieving revision 1.229
diff -u -p -r1.229 syscalls.master
--- kern/syscalls.master1 Aug 2022 14:56:59 -   1.229
+++ kern/syscalls.master1 Sep 2022 14:52:47 -
@@ -244,8 +244,10 @@
const char *permissions); }
 115STD { int sys___realpath(const char *pathname, \
char *resolved); }
-116OBSOL   t32_gettimeofday
-117OBSOL   t32_getrusage
+116STD NOLOCK  { int sys_recvmmsg(int s, struct mmsghdr *mmsg, \
+   unsigned int vlen, unsigned int flags, \
+   struct timespec *timeout); }
+117UNIMPL  sendmmsg
 118STD { int sys_getsockopt(int s, int level, int name, \
void *val, socklen_t *avalsize); }
 119STD { int sys_thrkill(pid_t tid, int signum, void *tcb); }
Index: kern/uipc_syscalls.c
===
RCS file: /mount/openbsd/cvs/src/sys/kern/uipc_syscalls.c,v
retrieving revision 1.201
diff -u -p -r1.201 uipc_syscalls.c
--- kern/uipc_syscalls.c14 Aug 2022 01:58:28 -  1.201
+++ kern/uipc_syscalls.c1 Sep 2022 14:37:26 -
@@ -805,6 +805,140 @@ done:
 }
 
 int
+sys_recvmmsg(struct proc *p, void *v, register_t *retval)
+{
+   struct sys_recvmmsg_args /* {
+   syscallarg(int) s;
+   syscallarg(struct mmsghdr *)mmsg;
+   syscallarg(unsigned int)vlen;
+   syscallarg(unsigned int)flags;
+   syscallarg(struct timespec *)   timeout;
+   } */ *uap = v;
+   struct mmsghdr mmsg, *mmsgp;
+   struct timespec ts, now;
+   struct iovec aiov[UIO_SMALLIOV], *uiov, *iov = aiov;
+   struct file *fp;
+   struct socket *so;
+   struct timespec *timeout;
+   size_t iovlen = UIO_SMALLIOV;
+   register_t retrec;
+   unsigned int vlen, dgrams;
+   int error = 0, flags, s;
+
+   s = SCARG(uap, s);
+   if ((error = getsock(p, s, )))
+   return (error);
+   so = (struct socket *)fp->f_data;
+
+   timeout = SCARG(uap, timeout);
+   if (timeout != NULL) {
+   error = copyin(timeout, , sizeof(ts));
+   if (error)
+   return error;
+#ifdef KTRACE
+   if (KTRPOINT(p, KTR_STRUCT))
+   ktrreltimespec(p, );
+#endif
+   getnanotime();
+   timespecadd(, , );
+   }
+
+   flags = SCARG(uap, flags);
+
+   /* Arbitrarily capped at 1024 datagrams. */
+   vlen = SCARG(uap, vlen);
+   if (vlen > 1024)
+   vlen = 1024;
+
+   mmsgp = SCARG(uap, mmsg);
+   for (dgrams = 0; dgrams < vlen;) {
+   error = copyin([dgrams], , sizeof(mmsg));
+   if (error)
+   break;
+
+   if (mmsg.msg_hdr.msg_iovlen > IOV_MAX) {
+   error = EMSGSIZE;
+   break;
+   }
+
+   if (mmsg.msg_hdr.msg_iovlen > iovlen) {
+   if (iov != aiov)
+   free(iov, M_IOV, iovlen *
+   sizeof(struct iovec));
+
+   iovlen = mmsg.msg_hdr.msg_iovlen;
+   iov = mallocarray(iovlen, sizeof(struct iovec),
+   M_IOV, M_WAITOK);
+   }
+
+   if (mmsg.msg_hdr.msg_iovlen > 0) {
+   error = copyin(mmsg.msg_hdr.msg_iov, iov,
+   mmsg.msg_hdr.msg_iovlen * sizeof(struct iovec));
+   if (error)
+   break;
+   }
+
+   uiov = mmsg.msg_hdr.msg_iov;
+   mmsg.msg_hdr.msg_iov = iov;
+   mmsg.msg_hdr.msg_flags = flags;
+
+   error = recvit(p, s, _hdr, NULL, );
+   if (error) {
+   if (error == EAGAIN && dgrams > 0)
+   error = 0;
+   break;
+   }
+
+   if (dgrams == 0 && flags & MSG_WAITFORONE) {
+   flags &= ~MSG_WAITFORONE;
+   flags |= MSG_DONTWAIT;
+   }
+
+   mmsg.msg_hdr.msg_iov = uiov;
+   mmsg.msg_len = retrec;
+#ifdef KTRACE
+   if (KTRPOINT(p, KTR_STRUCT)) {
+   ktrmmsghdr(p, );
+   if 

Re: add sendmmsg and recvmmsg systemcalls

2022-08-31 Thread Moritz Buhl
I created a pull request on their github:
https://github.com/NLnetLabs/nsd/pull/231

On Tue, Aug 30, 2022 at 04:33:10PM -0600, Todd C. Miller wrote:
> On Wed, 31 Aug 2022 00:19:20 +0200, Moritz Buhl wrote:
> 
> > On Tue, Aug 30, 2022 at 10:59:43PM +0200, Claudio Jeker wrote:
> > > And nsd in base. It seems unbound does not use recvmmsg. 
> >
> > After 'make -f Makefile.bsd-wrapper config' recvmmsg is picked up.
> > The config compile test currently defines NONBLOCKING_IS_BROKEN
> > because of a missing include.  Then vlen for recvmmsg would always
> > be 1.
> >
> > checking for struct mmsghdr... yes
> > checking for recvmmsg... yes
> > checking for sendmmsg... no
> > ...
> > checking if nonblocking sockets work... yes
> >
> > If anybody knows how to make nsd call nsd_recvmmsg this would
> > make for some nice testing.
> 
> Does this also fix the configure test?  If so, we can submit it
> upstream.
> 
>  - todd
> 
> Index: acx_nlnetlabs.m4
> ===
> RCS file: /cvs/src/usr.sbin/nsd/acx_nlnetlabs.m4,v
> retrieving revision 1.7
> diff -u -p -u -r1.7 acx_nlnetlabs.m4
> --- acx_nlnetlabs.m4  24 Oct 2021 12:14:18 -  1.7
> +++ acx_nlnetlabs.m4  30 Aug 2022 22:31:48 -
> @@ -963,6 +963,9 @@ AC_LANG_SOURCE([[
>  #ifdef HAVE_SYS_TYPES_H
>  #include 
>  #endif
> +#ifdef HAVE_SYS_SELECT_H
> +#include 
> +#endif
>  #ifdef HAVE_SYS_SOCKET_H
>  #include 
>  #endif
> Index: configure
> ===
> RCS file: /cvs/src/usr.sbin/nsd/configure,v
> retrieving revision 1.56
> diff -u -p -u -r1.56 configure
> --- configure 30 Jun 2022 10:49:39 -  1.56
> +++ configure 30 Aug 2022 22:32:08 -
> @@ -6593,6 +6593,9 @@ else
>  #ifdef HAVE_SYS_TYPES_H
>  #include 
>  #endif
> +#ifdef HAVE_SYS_SELECT_H
> +#include 
> +#endif
>  #ifdef HAVE_SYS_SOCKET_H
>  #include 
>  #endif
> 



Re: add sendmmsg and recvmmsg systemcalls

2022-08-30 Thread Moritz Buhl
On Tue, Aug 30, 2022 at 10:59:43PM +0200, Claudio Jeker wrote:
> And nsd in base. It seems unbound does not use recvmmsg. 

After 'make -f Makefile.bsd-wrapper config' recvmmsg is picked up.
The config compile test currently defines NONBLOCKING_IS_BROKEN
because of a missing include.  Then vlen for recvmmsg would always
be 1.

checking for struct mmsghdr... yes
checking for recvmmsg... yes
checking for sendmmsg... no
...
checking if nonblocking sockets work... yes

If anybody knows how to make nsd call nsd_recvmmsg this would
make for some nice testing.

mbuhl

Index: Makefile.bsd-wrapper
===
RCS file: /mount/openbsd/cvs/src/usr.sbin/nsd/Makefile.bsd-wrapper,v
retrieving revision 1.19
diff -u -p -r1.19 Makefile.bsd-wrapper
--- Makefile.bsd-wrapper30 Jun 2021 11:50:22 -  1.19
+++ Makefile.bsd-wrapper30 Aug 2022 22:09:46 -
@@ -22,6 +22,7 @@ CONFIGURE_OPTS=   --prefix=/usr \
--with-xfrdfile=${CHROOTDIR}/run/xfrd.state \
--with-libevent=/usr \
--enable-ratelimit \
+   --enable-recvmmsg \
--enable-root-server
 
 PROG=  nsd nsd-checkconf nsd-checkzone nsd-control
Index: configure
===
RCS file: /mount/openbsd/cvs/src/usr.sbin/nsd/configure,v
retrieving revision 1.56
diff -u -p -r1.56 configure
--- configure   30 Jun 2022 10:49:39 -  1.56
+++ configure   30 Aug 2022 22:04:41 -
@@ -6609,6 +6609,8 @@ else
 #include 
 #endif
 
+#include 
+
 int main(void)
 {
int port;



Re: add sendmmsg and recvmmsg systemcalls

2022-08-30 Thread Moritz Buhl
ype of socket
-the message is received from (see
-.Xr socket 2 ) .
-.Pp
 If no messages are available at the socket, the
 receive call waits for a message to arrive, unless
 the socket is nonblocking (see
@@ -158,6 +159,8 @@ The
 .Dv MSG_CMSG_CLOEXEC
 requests that any file descriptors received as ancillary data with
 .Fn recvmsg
+and
+.Fn recvmmsg
 (see below)
 have their close-on-exec flag set.
 .Pp
@@ -249,13 +252,67 @@ Indicates that the packet was received a
 .It Dv MSG_MCAST
 Indicates that the packet was received as multicast.
 .El
+.Pp
+The
+.Fn recvmmsg
+call uses an array of the
+.Fa mmsghdr
+structure of length
+.Fa vlen
+to group multiple
+.Fa msghdr
+structures into a single system call.
+.Fa vlen
+is capped at maximum
+.Dv 1024
+messages that are received in a single call.
+The
+.Fa flags
+field allows setting
+.Dv MSG_WAITFORONE
+to wait for one
+.Fa msghdr ,
+and set
+.Dv MSG_DONTWAIT
+for all subsequent messages.
+A provided
+.Fa timeout
+limits the time spent in the function but it does not limit the
+time spent in lower parts of the kernel.
+.Pp
+The
+.Fa mmsghdr
+structure has the following form, as defined in
+.In sys/socket.h :
+.Bd -literal
+struct mmsghdr {
+   struct msghdr msg_hdr;
+   unsigned int msg_len;
+};
+.Ed
+.Pp
+Here
+.Fa msg_len
+indicated the number of bytes received for each
+.Fa msg_hdr
+member.
 .Sh RETURN VALUES
-These calls return the number of bytes received, or \-1 if an error occurred.
-.Sh ERRORS
+The
 .Fn recv ,
 .Fn recvfrom ,
 and
 .Fn recvmsg
+calls return the number of bytes received, or \-1 if an error occurred.
+The
+.Fn recvmmsg
+call returns the number of messages received, or \-1
+if an error occurred before the first message has been received. 
+.Sh ERRORS
+.Fn recv ,
+.Fn recvfrom ,
+.Fn recvmsg ,
+and
+.Fn recvmmsg
 fail if:
 .Bl -tag -width "[EHOSTUNREACH]"
 .It Bq Er EBADF
@@ -310,6 +367,8 @@ was larger than
 .Pp
 And
 .Fn recvmsg
+and
+.Fn recvmmsg
 may return one of the following errors:
 .Bl -tag -width Er
 .It Bq Er EINVAL
@@ -364,6 +423,11 @@ The
 .Fn recv
 function call appeared in
 .Bx 4.1c .
+The
+.Fn recvmmsg
+syscall first appeared in Linux 2.6.33
+and has been available since
+.Ox 7.2 .
 .Sh CAVEATS
 Calling
 .Fn recvmsg
Index: lib/libc/sys/w_recvmmsg.c
===
RCS file: lib/libc/sys/w_recvmmsg.c
diff -N lib/libc/sys/w_recvmmsg.c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ lib/libc/sys/w_recvmmsg.c   22 Apr 2022 16:03:30 -
@@ -0,0 +1,32 @@
+/* $OpenBSD$ */
+/*
+ * Copyright (c) 2022 Moritz Buhl 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+#include "cancel.h"
+
+int
+recvmmsg(int fd, struct mmsghdr *mmsg, unsigned int vlen, unsigned int flags,
+   struct timespec *ts)
+{
+   int ret;
+
+   ENTER_CANCEL_POINT(1);
+   ret = HIDDEN(recvmmsg)(fd, mmsg, vlen, flags, ts);
+   LEAVE_CANCEL_POINT(ret == -1);
+   return (ret);
+}
+DEF_CANCEL(recvmmsg);
Index: regress/lib/libc/sys/Makefile
===
RCS file: /cvs/src/regress/lib/libc/sys/Makefile,v
retrieving revision 1.15
diff -u -p -r1.15 Makefile
--- regress/lib/libc/sys/Makefile   6 Jan 2022 03:30:15 -   1.15
+++ regress/lib/libc/sys/Makefile   30 Aug 2022 15:44:29 -
@@ -54,6 +54,7 @@ PROGS +=  t_pipe2
 PROGS +=   t_poll
 PROGS +=   t_ppoll
 PROGS +=   t_ptrace
+PROGS +=   t_recvmmsg
 PROGS +=   t_revoke
 PROGS +=   t_select
 PROGS +=   t_sendrecv
Index: regress/lib/libc/sys/atf-c.h
===
RCS file: /cvs/src/regress/lib/libc/sys/atf-c.h,v
retrieving revision 1.4
diff -u -p -r1.4 atf-c.h
--- regress/lib/libc/sys/atf-c.h28 May 2022 18:39:39 -  1.4
+++ regress/lib/libc/sys/atf-c.h30 Aug 2022 15:44:29 -
@@ -76,6 +76,7 @@ ATF_TC_FUNCTIONS(fn)
 #define ATF_CHECK  ATF_REQUIRE
 #define ATF_CHECK_MSG  ATF_REQUIRE_MSG
 #define ATF_CHECK_EQ   ATF_REQUIRE_EQ
+#define ATF_CHECK_EQ_MSG   ATF_REQUIRE_EQ_MSG
 #define ATF_CHECK_ERRNOATF_REQUIRE_ERRNO
 #define ATF_CHECK_STREQATF_REQUIRE_STREQ
 
Index: regress/lib/libc/sys/t_rec

Re: delete unused variable in ix(4) rx checksum calc

2022-08-30 Thread Moritz Buhl
ok mbuhl

On Tue, Aug 30, 2022 at 05:13:35PM +0200, Sebastian Benoit wrote:
> ptype is never used.
> ok?
> 
> diff --git sys/dev/pci/if_ix.c sys/dev/pci/if_ix.c
> index cb233034d23..72a221b97d9 100644
> --- sys/dev/pci/if_ix.c
> +++ sys/dev/pci/if_ix.c
> @@ -148,7 +148,7 @@ void  ixgbe_enable_intr(struct ix_softc *);
>  void ixgbe_disable_intr(struct ix_softc *);
>  int  ixgbe_txeof(struct tx_ring *);
>  int  ixgbe_rxeof(struct rx_ring *);
> -void ixgbe_rx_checksum(uint32_t, struct mbuf *, uint32_t);
> +void ixgbe_rx_checksum(uint32_t, struct mbuf *);
>  void ixgbe_iff(struct ix_softc *);
>  void ixgbe_map_queue_statistics(struct ix_softc *);
>  void ixgbe_update_link_status(struct ix_softc *);
> @@ -3103,7 +3103,7 @@ ixgbe_rxeof(struct rx_ring *rxr)
>   struct mbuf *mp, *sendmp;
>   uint8_t  eop = 0;
>   uint16_t len, vtag;
> - uint32_t staterr = 0, ptype;
> + uint32_t staterr = 0;
>   struct ixgbe_rx_buf *rxbuf, *nxbuf;
>   union ixgbe_adv_rx_desc *rxdesc;
>   size_t   dsize = sizeof(union ixgbe_adv_rx_desc);
> @@ -3140,8 +3140,6 @@ ixgbe_rxeof(struct rx_ring *rxr)
>  
>   mp = rxbuf->buf;
>   len = letoh16(rxdesc->wb.upper.length);
> - ptype = letoh32(rxdesc->wb.lower.lo_dword.data) &
> - IXGBE_RXDADV_PKTTYPE_MASK;
>   vtag = letoh16(rxdesc->wb.upper.vlan);
>   eop = ((staterr & IXGBE_RXD_STAT_EOP) != 0);
>   hash = lemtoh32(>wb.lower.hi_dword.rss);
> @@ -3213,7 +3211,7 @@ ixgbe_rxeof(struct rx_ring *rxr)
>   sendmp = NULL;
>   mp->m_next = nxbuf->buf;
>   } else { /* Sending this frame? */
> - ixgbe_rx_checksum(staterr, sendmp, ptype);
> + ixgbe_rx_checksum(staterr, sendmp);
>  
>   if (hashtype != IXGBE_RXDADV_RSSTYPE_NONE) {
>   sendmp->m_pkthdr.ph_flowid = hash;
> @@ -3251,7 +3249,7 @@ next_desc:
>   *
>   */
>  void
> -ixgbe_rx_checksum(uint32_t staterr, struct mbuf * mp, uint32_t ptype)
> +ixgbe_rx_checksum(uint32_t staterr, struct mbuf * mp)
>  {
>   uint16_t status = (uint16_t) staterr;
>   uint8_t  errors = (uint8_t) (staterr >> 24);
> 



Re: ipv6 fragment checksum

2022-08-08 Thread Moritz Buhl
On Sun, Aug 07, 2022 at 12:41:29AM +0200, Alexander Bluhm wrote:
> Hi,
> 
> If interface drivers have enabled transmit offloading for the payload
> checksum , IPv6 fragments contain invalid checksum.  For fragments
> the protocol checksum has to be calculated before fragmentation.
> Hardware cannot do this as it is too late.  Do it earlier in software.
> 
> ip_fragement() has such code, but in IPv6 it is missing.  Note that
> in6_proto_cksum_out() has to be called before the next protocol is
> set to IPPROTO_FRAGMENT.
> 
> ok?
> 
> bluhm
> 
> Index: netinet6/ip6_output.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/netinet6/ip6_output.c,v
> retrieving revision 1.269
> diff -u -p -r1.269 ip6_output.c
> --- netinet6/ip6_output.c 29 Jun 2022 22:45:24 -  1.269
> +++ netinet6/ip6_output.c 6 Aug 2022 22:21:04 -
> @@ -729,6 +729,12 @@ reroute:
>   mtu = IPV6_MAXPACKET;
>  
>   /*
> +  * If we are doing fragmentation, we can't defer TCP/UDP
> +  * checksumming; compute the checksum and clear the flag.
> +  */
> +in6_proto_cksum_out(m, NULL);
> +
> + /*
>* Change the next header field of the last header in the
>* unfragmentable part.
>*/
> 

tested this using udp on em (with and without my offloading diff),
igc, and ix openbsd-linux and linux-openbsd-linux-forwarding.
ok mbuhl



pfctl: possible memory leak in load_feedback_profile

2022-07-27 Thread Moritz Buhl
Dear tech@,

the scan-build(1) tool from the llvm port on sbin/pfctl reported a
memory leak in the error path of load_feedback_profile.

There are more places that could be investigated.
Would it make sense to look into these?

Independent of that I was wondering if it makes sense to warn if
pfctl_optimize_ruleset fails.  There cases should exist where
the function with 1 without printing a warning.

Any feedback?
mbuhl


Index: sbin/pfctl/pfctl.c
===
RCS file: /cvs/src/sbin/pfctl/pfctl.c,v
retrieving revision 1.388
diff -u -p -r1.388 pfctl.c
--- sbin/pfctl/pfctl.c  27 Jul 2022 12:28:27 -  1.388
+++ sbin/pfctl/pfctl.c  27 Jul 2022 12:46:05 -
@@ -1452,7 +1452,8 @@ pfctl_load_ruleset(struct pfctl *pf, cha
}
 
if (pf->optimize)
-   pfctl_optimize_ruleset(pf, rs);
+   if (pfctl_optimize_ruleset(pf, rs))
+   warnx("ruleset optimizations failed");
 
while ((r = TAILQ_FIRST(rs->rules.active.ptr)) != NULL) {
TAILQ_REMOVE(rs->rules.active.ptr, r, entries);
Index: sbin/pfctl/pfctl_optimize.c
===
RCS file: /cvs/src/sbin/pfctl/pfctl_optimize.c,v
retrieving revision 1.49
diff -u -p -r1.49 pfctl_optimize.c
--- sbin/pfctl/pfctl_optimize.c 28 Jan 2022 05:24:15 -  1.49
+++ sbin/pfctl/pfctl_optimize.c 27 Jul 2022 12:46:05 -
@@ -869,13 +869,13 @@ load_feedback_profile(struct pfctl *pf, 
struct pf_ruleset *rs;
if ((por = calloc(1, sizeof(*por))) == NULL) {
warn("calloc");
-   return (1);
+   goto free_queue;
}
pr.nr = nr;
if (ioctl(pf->dev, DIOCGETRULE, ) == -1) {
warnx("%s", pf_strerror(errno));
free(por);
-   return (1);
+   goto free_queue;
}
memcpy(>por_rule, , sizeof(por->por_rule));
rs = pf_find_or_create_ruleset(pr.anchor_call);
@@ -884,7 +884,7 @@ load_feedback_profile(struct pfctl *pf, 
}
 
if (construct_superblocks(pf, , _superblocks))
-   return (1);
+   goto free_queue;
 
 
/*
@@ -920,6 +920,14 @@ load_feedback_profile(struct pfctl *pf, 
blockcur = block;
}
return (0);
+
+free_queue:
+   while ((por = TAILQ_FIRST())) {
+   TAILQ_REMOVE(, por, por_entry);
+   free(por);
+   }
+
+   return (1);
 }
 
 



Re: pf.conf(5): document new anchors limit

2022-07-23 Thread Moritz Buhl
On Thu, Jul 21, 2022 at 10:25:09PM +0100, Jason McIntyre wrote:
...
> it looks like the "set limit" text in pf.conf(5) might need some small
> adjustments:
> 
> - as well as the "anchors" limit, it does not document "pktdelay-pkts"
> 
> - for entries where defaults are not documented, it is not clear whether
>   this is an omission or they are just not limited by default (in the
>   same way that things like table numbers are limited). those affected
>   seem to be src-nodes and pktdelay-pkts

All of these limits have defaults that are sometimes set in the
kernel or with the first pfctl invocation during boot:

/usr/src/sbin/pfctl.c:
1726 pf->limit[PF_LIMIT_SRC_NODES] = PFSNODE_HIWAT;
...
1729 pf->limit[PF_LIMIT_PKTDELAY_PKTS] = PF_PKTDELAY_MAXPKTS;

/usr/include/net/pfvar.h:
#define PFSNODE_HIWAT   1   /* default source node table size */
...
#define PF_PKTDELAY_MAXPKTS 1   /* max # of pkts held in delay queue */


> - the two entries for "table-entries" are confusing. it seems to be that
>   machines with less than a specific amount of memory have their entries
>   limited to the value of _SMALL. but the way it's documented makes that
>   unclear. i'm not sure whether the reader needs the names such as
>   PFSTATE_HIWAT. i think it's just confusing to list it this way. we
>   should probably have one item, table-entries, and say what the default
>   is normally, and what it is for lesser memory setups.

/usr/src/sbin/pfctl.c:
1737 if (mem <= 100*1024*1024)
1738 pf->limit[PF_LIMIT_TABLE_ENTRIES] = PFR_KENTRY_HIWAT_SMALL;

If the machine has less than 100MB physical memory, ..._SMALL is set for
PF_LIMIT_TABLE_ENTRIES.

> - i think the whole section should just be reduced to a simple list of
>   what can be set, and any default values.
> 
> here's a stab at tidying up. i've inserted a couple of XXX for where i
> came unstuck.
> 
> thoughts? help?

I appreciate where this is going.

mbuhl

> jmc
> 
> Index: pf.conf.5
> ===
> RCS file: /cvs/src/share/man/man5/pf.conf.5,v
> retrieving revision 1.596
> diff -u -p -r1.596 pf.conf.5
> --- pf.conf.5 27 May 2022 15:45:02 -  1.596
> +++ pf.conf.5 21 Jul 2022 21:22:08 -
> @@ -1236,65 +1236,56 @@ See
>  .Xr pool 9
>  for an explanation of memory pools.
>  .Pp
> -For example,
> -to set the maximum number of entries in the memory pool used by state table
> -entries (generated by
> +Limits can be set on the following:
> +.Bl -tag -width pktdelay_pkts
> +.It Cm states
> +Set the maximum number of entries in the memory pool used by state table
> +entries (those generated by
>  .Ic pass
>  rules which do not specify
> -.Cm no state )
> -to 2:
> -.Pp
> -.Dl set limit states 2
> -.Pp
> -To set the maximum number of entries in the memory pool used for fragment
> -reassembly to 2000:
> -.Pp
> -.Dl set limit frags 2000
> -.Pp
> -This maximum may not exceed, and should be well below, the maximum number
> -of mbuf clusters
> -.Pq sysctl kern.maxclusters
> -in the system.
> -.Pp
> -To set the maximum number of entries in the memory pool used for tracking
> +.Cm no state ) .
> +The default is 10.
> +.It Cm src-nodes
> +Set the maximum number of entries in the memory pool used for tracking
>  source IP addresses (generated by the
>  .Cm sticky-address
>  and
>  .Cm src.track
> -options) to 2000:
> -.Pp
> -.Dl set limit src-nodes 2000
> -.Pp
> -To set limits on the memory pools used by tables:
> -.Bd -literal -offset indent
> -set limit tables 1000
> -set limit table-entries 10
> -.Ed
> -.Pp
> -The first limits the number of tables that can exist to 1000.
> -The second limits the overall number of addresses that can be stored
> -in tables to 10.
> -.Pp
> -Various limits can be combined on a single line:
> -.Bd -literal -offset indent
> -set limit { states 2, frags 2000, src-nodes 2000 }
> -.Ed
> -.Pp
> -.Xr pf 4
> -has the following defaults:
> -.Bl -column table-entries PFR_KENTRY_HIWAT_SMALL platform_dependent
> -.It states Ta Dv PFSTATE_HIWAT Ta Pq 10
> -.It tables Ta Dv PFR_KTABLE_HIWAT Ta Pq 1000
> -.It table-entries Ta Dv PFR_KENTRY_HIWAT Ta Pq 20
> -.It table-entries Ta Dv PFR_KENTRY_HIWAT_SMALL Ta Pq 10
> -.It frags Ta Dv NMBCLUSTERS Ns /32 Ta Pq platform dependent
> -.El
> -.Pp
> +options).
> +The default is
> +.\" XXX
> +.It Cm frags
> +Set the maximum number of entries in the memory pool used for fragment
> +reassembly.
> +The maximum may not exceed, and should be well below,
> +the maximum number of mbuf clusters
> +.Pq sysctl kern.maxclusters
> +in the system.
> +The default is NMBCLUSTERS/32.
>  .Dv NMBCLUSTERS
>  defines the total number of packets which can exist in-system at any one 
> time.
>  Refer to
>  .In machine/param.h
>  for the platform-specific value.
> +.It Cm tables
> +Set the number of tables that can exist.
> +The default is 1000.
> +.It Cm table-entries
> +Set the number of addresses 

Re: checksum offloading for em

2022-07-22 Thread Moritz Buhl
On Mon, Jun 27, 2022 at 08:07:32AM +1000, Jonathan Gray wrote:
> On Sun, Jun 26, 2022 at 04:43:59PM +0200, Moritz Buhl wrote:
> > On Sun, Jun 26, 2022 at 12:23:58PM +0200, Moritz Buhl wrote:
> > > Hi,
> > > 
> > > I noticed that for some offloading-capable em controllers checksum
> > > offloading is still disabled and I couldn't find a reason for that.
> 
> There are two descriptor formats on 82575/82576/i350/i354/i210/i211.
> The older one we use and the newer igb/ix style one we don't use in em.
> A lot of the offloading options are in the newer descriptor format
> from memory.

Thanks that helped a lot.  Below is a diff that implements offloading
for i350 and i210 (I haven't checked it for the others you mentioned).
It also does tcp and upd checksums for the older ones also ipv4.

I didn't feel confident in the diff but I currently cannot cause
any misbehavoir.

Using VLAN with i350 and i210 should still break because
IFCAP_VLAN_HWTAGGING is not fully implemented yet.

I would appreciate any feedback and comments.

mbuhl


Index: dev/pci/if_em.c
===
RCS file: /cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.362
diff -u -p -r1.362 if_em.c
--- dev/pci/if_em.c 23 Jun 2022 09:38:28 -  1.362
+++ dev/pci/if_em.c 22 Jul 2022 12:37:27 -
@@ -37,6 +37,8 @@ POSSIBILITY OF SUCH DAMAGE.
 #include 
 #include 
 
+#include 
+
 /*
  *  Driver version
  */
@@ -278,6 +280,8 @@ void em_receive_checksum(struct em_softc
 struct mbuf *);
 u_int  em_transmit_checksum_setup(struct em_queue *, struct mbuf *, u_int,
u_int32_t *, u_int32_t *);
+u_int  em_tx_ctx_setup(struct em_queue *, struct mbuf *, u_int, u_int32_t *,
+   u_int32_t *);
 void em_iff(struct em_softc *);
 void em_update_link_status(struct em_softc *);
 int  em_get_buf(struct em_queue *, int);
@@ -1221,11 +1225,12 @@ em_encap(struct em_queue *que, struct mb
}
 
if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350) {
+   sc->hw.mac_type != em_82576 && sc->hw.mac_type != em_82580 &&
+   sc->hw.mac_type != em_i210 && sc->hw.mac_type != em_i350) {
used += em_transmit_checksum_setup(que, m, head,
_upper, _lower);
+   } else if (sc->hw.mac_type == em_i210 || sc->hw.mac_type == em_i350) {
+   used += em_tx_ctx_setup(que, m, head, _upper, _lower);
} else {
txd_upper = txd_lower = 0;
}
@@ -1971,10 +1976,11 @@ em_setup_interface(struct em_softc *sc)
 #endif
 
if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350)
+   sc->hw.mac_type != em_82576 && sc->hw.mac_type != em_82580) {
+   ifp->if_capabilities |= IFCAP_CSUM_IPv4;
ifp->if_capabilities |= IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4;
+   ifp->if_capabilities |= IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
+   }
 
/* 
 * Specify the media types supported by this adapter and register
@@ -2391,6 +2397,96 @@ em_free_transmit_structures(struct em_so
}
 }
 
+u_int
+em_tx_ctx_setup(struct em_queue *que, struct mbuf *mp, u_int head,
+u_int32_t *olinfo_status, u_int32_t *cmd_type_len)
+{
+   struct e1000_adv_tx_context_desc *TD;
+   struct ether_header *eh = mtod(mp, struct ether_header *);
+   struct mbuf *m;
+   uint32_t vlan_macip_lens = 0, type_tucmd_mlhl = 0;
+   uint32_t iphlen;
+   int off = 0, hoff;
+   uint8_t ipproto;
+
+   *olinfo_status = 0;
+   *cmd_type_len = 0;
+   TD = (struct e1000_adv_tx_context_desc *)>tx.sc_tx_desc_ring[head];
+   
+   // XXX: VLAN TAGGING
+
+   vlan_macip_lens |= (sizeof(*eh) << E1000_ADVTXD_MACLEN_SHIFT);
+   
+   switch (ntohs(eh->ether_type)) {
+   case ETHERTYPE_IP: {
+   struct ip *ip;
+
+   m = m_getptr(mp, sizeof(*eh), );
+   ip = (struct ip *)(mtod(m, caddr_t) + hoff);
+
+   iphlen = ip->ip_hl << 2;
+   ipproto = ip->ip_p;
+
+   type_tucmd_mlhl |= E1000_ADVTXD_TUCMD_IPV4;
+   if (ISSET(mp->m_pkthdr.csum_flags, M_IPV4_CSUM_OUT)) {
+   *olinfo_status |= E1000_TXD_POPTS_IXS

pf: DIOCXCOMMIT and copyin

2022-07-21 Thread Moritz Buhl
Hi tech,

for the other two DIOCX ioctls syzkaller showed that it is possible
to grab netlock while doing copyin.

The same problem should exist for DIOCXCOMMIT but syzkaller didn't
find it yet.

In case anybody can reproduce the witness lock order reversals the
syzkaller can produce, the diff below might address the problem.

mbuhl

Index: sys/net/pf_ioctl.c
===
RCS file: /cvs/src/sys/net/pf_ioctl.c,v
retrieving revision 1.383
diff -u -p -r1.383 pf_ioctl.c
--- sys/net/pf_ioctl.c  20 Jul 2022 09:33:11 -  1.383
+++ sys/net/pf_ioctl.c  20 Jul 2022 18:17:45 -
@@ -2621,13 +2621,9 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
}
ioe = malloc(sizeof(*ioe), M_TEMP, M_WAITOK);
table = malloc(sizeof(*table), M_TEMP, M_WAITOK);
-   NET_LOCK();
-   PF_LOCK();
/* first makes sure everything will succeed */
for (i = 0; i < io->size; i++) {
if (copyin(io->array+i, ioe, sizeof(*ioe))) {
-   PF_UNLOCK();
-   NET_UNLOCK();
free(table, M_TEMP, sizeof(*table));
free(ioe, M_TEMP, sizeof(*ioe));
error = EFAULT;
@@ -2635,13 +2631,13 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
}
if (strnlen(ioe->anchor, sizeof(ioe->anchor)) ==
sizeof(ioe->anchor)) {
-   PF_UNLOCK();
-   NET_UNLOCK();
free(table, M_TEMP, sizeof(*table));
free(ioe, M_TEMP, sizeof(*ioe));
error = ENAMETOOLONG;
goto fail;
}
+   NET_LOCK();
+   PF_LOCK();
switch (ioe->type) {
case PF_TRANS_TABLE:
rs = pf_find_ruleset(ioe->anchor);
@@ -2677,7 +2673,11 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
error = EINVAL;
goto fail;
}
+   PF_UNLOCK();
+   NET_UNLOCK();
}
+   NET_LOCK();
+   PF_LOCK();
 
/*
 * Checked already in DIOCSETLIMIT, but check again as the
@@ -2696,9 +2696,9 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
}
/* now do the commit - no errors should happen here */
for (i = 0; i < io->size; i++) {
+   PF_UNLOCK();
+   NET_UNLOCK();
if (copyin(io->array+i, ioe, sizeof(*ioe))) {
-   PF_UNLOCK();
-   NET_UNLOCK();
free(table, M_TEMP, sizeof(*table));
free(ioe, M_TEMP, sizeof(*ioe));
error = EFAULT;
@@ -2706,13 +2706,13 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
}
if (strnlen(ioe->anchor, sizeof(ioe->anchor)) ==
sizeof(ioe->anchor)) {
-   PF_UNLOCK();
-   NET_UNLOCK();
free(table, M_TEMP, sizeof(*table));
free(ioe, M_TEMP, sizeof(*ioe));
error = ENAMETOOLONG;
goto fail;
}
+   NET_LOCK();
+   PF_LOCK();
switch (ioe->type) {
case PF_TRANS_TABLE:
memset(table, 0, sizeof(*table));



Re: pf: pool for pf_anchor

2022-07-20 Thread Moritz Buhl
On Tue, Jul 19, 2022 at 10:58:08PM +0200, Alexander Bluhm wrote:
> On Tue, Jul 19, 2022 at 07:18:54PM +0200, Moritz Buhl wrote:
> > A solution would be to move the allocation of the pf_anchor struct
> > into a pool.  One final question would be regarding the size of the
> > hiwat or hardlimit.  Any suggestions?
> 
> 10 seems way to low.  We want to limit resource abuse.  But regular
> users should never hit this.  Especially existing configurations
> should still work after upgrade.
> 
> Perhaps 512 anchors.  Noone shoul ever need more than 512 anchors :-)

Sure.

> > Any other comments?
> 
> Please do not use PR_WAITOK in pool_init() unless you know what you
> are doing.  I think it should work, but who knows.  The other pf
> pools have 0 flags, just use that.

Ok.

> Comment says rs_malloc is needed to compile pfctl.  Have you tested
> that?

The new diff addresses this.  As well as documentation.

> Replacing M_WAITOK|M_CANFAIL|M_ZERO with PR_NOWAIT|PR_ZERO looks
> wrong.  PR_WAITOK|PR_LIMITFAIL|PR_ZERO should do something equivalent.

I don't know how I got this wrong.

> I am not sure if pf_find_anchor() should fail if the limit is hit.
> It uses only a temporary buffer.  All calls to pf_find_ruleset()
> should be checked whether permanent failure, after the limit has
> been reached, is what we want.  Or we keep the malloc there?

I guess I was overzealous replacing all mallocs for pf_anchor,
thanks for paying attention on the actual lifetime.

Is the following diff OK?

mbuhl


Index: sbin/pfctl/pfctl.c
===
RCS file: /mount/openbsd/cvs/src/sbin/pfctl/pfctl.c,v
retrieving revision 1.385
diff -u -p -r1.385 pfctl.c
--- sbin/pfctl/pfctl.c  11 Nov 2021 12:49:53 -  1.385
+++ sbin/pfctl/pfctl.c  20 Jul 2022 06:55:36 -
@@ -158,6 +158,7 @@ static const struct {
{ "tables", PF_LIMIT_TABLES },
{ "table-entries",  PF_LIMIT_TABLE_ENTRIES },
{ "pktdelay-pkts",  PF_LIMIT_PKTDELAY_PKTS },
+   { "anchors",PF_LIMIT_ANCHORS },
{ NULL, 0 }
 };
 
Index: share/man/man4/pf.4
===
RCS file: /mount/openbsd/cvs/src/share/man/man4/pf.4,v
retrieving revision 1.92
diff -u -p -r1.92 pf.4
--- share/man/man4/pf.4 26 May 2019 02:06:55 -  1.92
+++ share/man/man4/pf.4 20 Jul 2022 07:40:38 -
@@ -416,7 +416,7 @@ struct pfioc_limit {
 
 enum   { PF_LIMIT_STATES, PF_LIMIT_SRC_NODES, PF_LIMIT_FRAGS,
  PF_LIMIT_TABLES, PF_LIMIT_TABLE_ENTRIES, PF_LIMIT_PKTDELAY_PKTS,
- PF_LIMIT_MAX };
+ PF_LIMIT_ANCHORS, PF_LIMIT_MAX };
 .Ed
 .It Dv DIOCGETLIMIT Fa "struct pfioc_limit *pl"
 Get the hard
@@ -1033,6 +1033,7 @@ static const struct {
{ "frags",  PF_LIMIT_FRAGS },
{ "tables", PF_LIMIT_TABLES },
{ "table-entries",  PF_LIMIT_TABLE_ENTRIES },
+   { "anchors",PF_LIMIT_ANCHORS },
{ NULL, 0 }
 };
 
Index: sys/net/pf.c
===
RCS file: /mount/openbsd/cvs/src/sys/net/pf.c,v
retrieving revision 1.1135
diff -u -p -r1.1135 pf.c
--- sys/net/pf.c28 Jun 2022 13:48:06 -  1.1135
+++ sys/net/pf.c20 Jul 2022 06:59:37 -
@@ -276,7 +276,8 @@ struct pf_pool_limit pf_pool_limits[PF_L
{ _frent_pl, PFFRAG_FRENT_HIWAT, PFFRAG_FRENT_HIWAT },
{ _ktable_pl, PFR_KTABLE_HIWAT, PFR_KTABLE_HIWAT },
{ _kentry_pl, PFR_KENTRY_HIWAT, PFR_KENTRY_HIWAT },
-   { _pktdelay_pl, PF_PKTDELAY_MAXPKTS, PF_PKTDELAY_MAXPKTS }
+   { _pktdelay_pl, PF_PKTDELAY_MAXPKTS, PF_PKTDELAY_MAXPKTS },
+   { _anchor_pl, PF_ANCHOR_HIWAT, PF_ANCHOR_HIWAT }
 };
 
 #define BOUND_IFACE(r, k) \
Index: sys/net/pf_ioctl.c
===
RCS file: /mount/openbsd/cvs/src/sys/net/pf_ioctl.c,v
retrieving revision 1.382
diff -u -p -r1.382 pf_ioctl.c
--- sys/net/pf_ioctl.c  26 Jun 2022 11:37:08 -  1.382
+++ sys/net/pf_ioctl.c  20 Jul 2022 06:22:15 -
@@ -191,6 +191,8 @@ pfattach(int num)
IPL_SOFTNET, 0, "pftag", NULL);
pool_init(_pktdelay_pl, sizeof(struct pf_pktdelay), 0,
IPL_SOFTNET, 0, "pfpktdelay", NULL);
+   pool_init(_anchor_pl, sizeof(struct pf_anchor), 0,
+   IPL_SOFTNET, 0, "pfanchor", NULL);
 
hfsc_initialize();
pfr_initialize();
@@ -200,6 +202,8 @@ pfattach(int num)
 
pool_sethardlimit(pf_pool_limits[PF_LIMIT_STATES].pp,
pf_pool_limits[PF_LIMIT_STATES].limit, NULL, 0);
+   pool_sethardlimit(pf_pool_limits[PF_LIMIT_ANCHORS].pp,
+   pf_pool_limits[PF_LIMIT_ANCHORS].limit, NULL, 0);
 
if

pf: pool for pf_anchor

2022-07-19 Thread Moritz Buhl
Dear tech@,

I am investigating a syzkaller reproducer found in the
"no output from test machine (7)" crashes:
https://syzkaller.appspot.com/bug?id=d93e92fde3857c69df2cf46b4244d9814c4318a7
https://syzkaller.appspot.com/text?tag=ReproC=116ee2e008

The code calls DIOCXBEGIN, with different anchor names
until the malloc bucket for size 2048 fills up.
So the following reproducer does the same:

#include 
#include 
#include 

#include 
#include 
#include 
#include 

int calls;
void
iterate(char *str)
{
char *end = str + 64;
*end = '\0';
while (*str == '\255') {
*str = '\1';
str++;
}

if (str == end) return;

if (*str == '\0') *str = '\1';
*str = ++*str;
}

int
main(void)
{
struct pfioc_trans trans;
struct pfioc_trans_e elem;
int fd, ret;
char it[64] = "\0";

fd = open("/dev/pf", O_RDWR);

while (1) {
trans.size = 1;
trans.esize = 0x408;
trans.array = 
elem.type = 0;
iterate(it);
strcpy(elem.anchor, it);
elem.ticket = 0;
calls++;
if (ioctl(fd, DIOCXBEGIN, ) == -1)
err(1, "icoctl: %d calls, '%s'", calls, elem.anchor);
}
}

A solution would be to move the allocation of the pf_anchor struct
into a pool.  One final question would be regarding the size of the
hiwat or hardlimit.  Any suggestions?

Any other comments?

mbuhl


Index: sys/net/pf.c
===
RCS file: /mount/openbsd/cvs/src/sys/net/pf.c,v
retrieving revision 1.1135
diff -u -p -r1.1135 pf.c
--- sys/net/pf.c28 Jun 2022 13:48:06 -  1.1135
+++ sys/net/pf.c19 Jul 2022 11:34:23 -
@@ -269,6 +269,7 @@ void pf_log_matches(struct pf_pdesc 
*
 
 extern struct pool pfr_ktable_pl;
 extern struct pool pfr_kentry_pl;
+extern struct pool pf_anchor_pl;
 
 struct pf_pool_limit pf_pool_limits[PF_LIMIT_MAX] = {
{ _state_pl, PFSTATE_HIWAT, PFSTATE_HIWAT },
@@ -276,7 +277,8 @@ struct pf_pool_limit pf_pool_limits[PF_L
{ _frent_pl, PFFRAG_FRENT_HIWAT, PFFRAG_FRENT_HIWAT },
{ _ktable_pl, PFR_KTABLE_HIWAT, PFR_KTABLE_HIWAT },
{ _kentry_pl, PFR_KENTRY_HIWAT, PFR_KENTRY_HIWAT },
-   { _pktdelay_pl, PF_PKTDELAY_MAXPKTS, PF_PKTDELAY_MAXPKTS }
+   { _pktdelay_pl, PF_PKTDELAY_MAXPKTS, PF_PKTDELAY_MAXPKTS },
+   { _anchor_pl, PF_ANCHOR_HIWAT, PF_ANCHOR_HIWAT }
 };
 
 #define BOUND_IFACE(r, k) \
Index: sys/net/pf_ioctl.c
===
RCS file: /mount/openbsd/cvs/src/sys/net/pf_ioctl.c,v
retrieving revision 1.382
diff -u -p -r1.382 pf_ioctl.c
--- sys/net/pf_ioctl.c  26 Jun 2022 11:37:08 -  1.382
+++ sys/net/pf_ioctl.c  19 Jul 2022 16:56:23 -
@@ -191,6 +191,8 @@ pfattach(int num)
IPL_SOFTNET, 0, "pftag", NULL);
pool_init(_pktdelay_pl, sizeof(struct pf_pktdelay), 0,
IPL_SOFTNET, 0, "pfpktdelay", NULL);
+   pool_init(_anchor_pl, sizeof(struct pf_anchor), 0,
+   IPL_SOFTNET, PR_WAITOK, "pfanchor", NULL);
 
hfsc_initialize();
pfr_initialize();
@@ -200,6 +202,8 @@ pfattach(int num)
 
pool_sethardlimit(pf_pool_limits[PF_LIMIT_STATES].pp,
pf_pool_limits[PF_LIMIT_STATES].limit, NULL, 0);
+   pool_sethardlimit(pf_pool_limits[PF_LIMIT_ANCHORS].pp,
+   pf_pool_limits[PF_LIMIT_ANCHORS].limit, NULL, 0);
 
if (physmem <= atop(100*1024*1024))
pf_pool_limits[PF_LIMIT_TABLE_ENTRIES].limit =
Index: sys/net/pf_ruleset.c
===
RCS file: /mount/openbsd/cvs/src/sys/net/pf_ruleset.c,v
retrieving revision 1.18
diff -u -p -r1.18 pf_ruleset.c
--- sys/net/pf_ruleset.c27 Dec 2018 16:54:01 -  1.18
+++ sys/net/pf_ruleset.c19 Jul 2022 12:56:35 -
@@ -40,6 +40,7 @@
 #ifdef _KERNEL
 #include 
 #include 
+#include 
 #endif /* _KERNEL */
 #include 
 
@@ -55,6 +56,7 @@
 #endif /* INET6 */
 
 
+struct poolpf_anchor_pl;
 #ifdef _KERNEL
 #define rs_malloc(x)   malloc(x, M_TEMP, M_WAITOK|M_CANFAIL|M_ZERO)
 #define rs_free(x, siz)free(x, M_TEMP, siz)
@@ -107,12 +109,12 @@ pf_find_anchor(const char *path)
 {
struct pf_anchor*key, *found;
 
-   key = rs_malloc(sizeof(*key));
+   key = pool_get(_anchor_pl, PR_NOWAIT | PR_ZERO);
if (key == NULL)
return (NULL);
strlcpy(key->path, path, sizeof(key->path));
found = RB_FIND(pf_anchor_global, _anchors, key);
-   rs_free(key, sizeof(*key));
+   pool_put(_anchor_pl, key);
return (found);
 }
 
@@ -184,7 +186,7 @@ pf_create_anchor(struct pf_anchor *paren
((parent != NULL) && (strlen(parent->path) >= PF_ANCHOR_MAXPATH)))
return 

regress/lib/libm/rint on armv7

2022-07-15 Thread Moritz Buhl
Hi tech@, Dear miod@, Dear kettenis@,

On arm the lib/libm/rint regress test is failing. After adding some
printf debugging into libm it turns out that the problem is the
cast from double to long long.
http://bluhm.genua.de/regress/results/2022-07-12T04%3A17%3A03Z/logs/lib/libm/rint/make.log:

 run-regress-rint 
cc -O2 -pipe   -MD -MP  -c /usr/src/regress/lib/libm/rint/rint.c
cc   -o rint rint.o -lm
./rint
assertion "llrintf(4503599627370496.0F) == 4503599627370496LL"
failed: file "/usr/src/regress/lib/libm/rint/rint.c", line 51,
function "main" *** Signal SIGABRT in . (:36
'run-regress-rint') FAILED


After discussing this with kettenis@, he suggested copying the
ieee754 implementations from NetBSD.

The only modification I find worth mentioning is in fixdfdi.c:
if (exp >= 63)
the original code checked for 62, but this fails the regression test too:
assertion "llrint(0x7c00.0p0) == 0x7c00LL"
failed: file "/usr/src/regress/lib/libm/rint/rint.c", line 55,
function "main"

Any thoughts on this?

mbuhl

Index: lib/libc/arch/arm/quad/fixdfdi.c
===
RCS file: lib/libc/arch/arm/quad/fixdfdi.c
diff -N lib/libc/arch/arm/quad/fixdfdi.c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ lib/libc/arch/arm/quad/fixdfdi.c15 Jul 2022 10:57:41 -
@@ -0,0 +1,83 @@
+/* $OpenBSD$   */
+/* $NetBSD: fixdfdi_ieee754.c,v 1.1 2013/08/24 00:51:48 matt Exp $ */
+
+/*-
+ * Copyright (c) 1992, 1993
+ * The Regents of the University of California.  All rights reserved.
+ *
+ * This software was developed by the Computer Systems Engineering group
+ * at Lawrence Berkeley Laboratory under DARPA contract BG 91-66 and
+ * contributed to Berkeley.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the University nor the names of its contributors
+ *may be used to endorse or promote products derived from this software
+ *without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include 
+#if defined(LIBC_SCCS) && !defined(lint)
+__RCSID("$NetBSD: fixdfdi_ieee754.c,v 1.1 2013/08/24 00:51:48 matt Exp $");
+#endif /* LIBC_SCCS and not lint */
+
+#if defined(SOFTFLOAT) || defined(__ARM_EABI__)
+#include "softfloat/softfloat-for-gcc.h"
+#endif
+
+#include "../../../quad/quad.h"
+#include 
+#include 
+#include 
+
+/*
+ * Convert double to signed quad.
+ * Not sure what to do with negative numbers---for now, anything out
+ * of range becomes UQUAD_MAX.
+ */
+quad_t
+__fixdfdi(double x)
+{
+   struct ieee_double ux = *(struct ieee_double *)
+   signed int exp = ux.dbl_exp - DBL_EXP_BIAS;
+   const bool neg = ux.dbl_sign;
+   quad_t r;
+
+   if (exp >= 63)
+   return neg ? QUAD_MIN : QUAD_MAX;
+
+   r = 1 << DBL_FRACHBITS; /* implicit bit */
+   r |= ux.dbl_frach;
+   exp -= DBL_FRACHBITS;
+   if (exp < 0) {
+   r >>= -exp;
+   } else if (exp > 0) {
+   r <<= DBL_FRACLBITS;
+   r |= ux.dbl_fracl;
+   exp -= DBL_FRACLBITS;
+   if (exp < 0) {
+   r >>= -exp;
+   } else if (exp > 0) {
+   r <<= exp;
+   }
+   }
+   return neg ? -r : r;
+}
Index: lib/libc/arch/arm/quad/fixsfdi.c
===
RCS file: lib/libc/arch/arm/quad/fixsfdi.c
diff -N lib/libc/arch/arm/quad/fixsfdi.c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ lib/libc/arch/arm/quad/fixsfdi.c15 Jul 2022 10:56:06 -
@@ -0,0 +1,76 @@
+/* $OpenBSD$   */
+/* $NetBSD: fixsfdi_ieee754.c,v 1.1 2013/08/24 00:51:48 matt Exp $ */
+
+/*-
+ * Copyright (c) 1992, 

Re: checksum offloading for em

2022-06-26 Thread Moritz Buhl
On Sun, Jun 26, 2022 at 12:23:58PM +0200, Moritz Buhl wrote:
> Hi,
> 
> I noticed that for some offloading-capable em controllers checksum
> offloading is still disabled and I couldn't find a reason for that.
> 
> It was assumed during initial implementation:
> https://marc.info/?l=openbsd-tech=137875739832438=2
> 
> According to the datasheets of the i350 controllers listed below,
> they do support the usual offloading features.
> https://www.intel.com/content/www/us/en/products/details/ethernet/gigabit-controllers/i350-controllers.html
> 
> tested with
> em0 at pci3 dev 0 function 0 "Intel I350" rev 0x01: msi

I made a mistake during testing. The diff doesn't work for i350.



checksum offloading for em

2022-06-26 Thread Moritz Buhl
Hi,

I noticed that for some offloading-capable em controllers checksum
offloading is still disabled and I couldn't find a reason for that.

It was assumed during initial implementation:
https://marc.info/?l=openbsd-tech=137875739832438=2

According to the datasheets of the i350 controllers listed below,
they do support the usual offloading features.
https://www.intel.com/content/www/us/en/products/details/ethernet/gigabit-controllers/i350-controllers.html

tested with
em0 at pci3 dev 0 function 0 "Intel I350" rev 0x01: msi
and
em0 at pci0 dev 31 function 6 "Intel I219-LM" rev 0x21: msi

I would appreciate any further feedback, discussion and especially testing.

mbuhl


Index: sys/dev/pci/if_em.c
===
RCS file: /cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.361
diff -u -p -r1.361 if_em.c
--- sys/dev/pci/if_em.c 11 Mar 2022 18:00:45 -  1.361
+++ sys/dev/pci/if_em.c 24 May 2022 08:47:41 -
@@ -1217,9 +1217,7 @@ em_encap(struct em_queue *que, struct mb
}
 
if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350) {
+   sc->hw.mac_type != em_82576 && sc->hw.mac_type != em_82580) {
used += em_transmit_checksum_setup(que, m, head,
_upper, _lower);
} else {
@@ -1966,9 +1964,7 @@ em_setup_interface(struct em_softc *sc)
 #endif
 
if (sc->hw.mac_type >= em_82543 && sc->hw.mac_type != em_82575 &&
-   sc->hw.mac_type != em_82576 &&
-   sc->hw.mac_type != em_82580 && sc->hw.mac_type != em_i210 &&
-   sc->hw.mac_type != em_i350)
+   sc->hw.mac_type != em_82576 && sc->hw.mac_type != em_82580)
ifp->if_capabilities |= IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4;
 
/* 



Re: another syzkaller problem in pf

2022-06-26 Thread Moritz Buhl
On Tue, May 03, 2022 at 09:31:51PM +0200, Alexander Bluhm wrote:
...
> The code is too complex to be sure what the reason of the syzkaller
> panic is.  Sleep in malloc is correct anyway and may improve the
> situation.
> 
> Functions with argument values 0 or 1 are hard to read.  It would
> be much nicer to pass M_WAITOK or M_NOWAIT.  And the variable name
> "intr" does not make sense anymore.  pf does not run in interrupt
> context.  Call it "mflags" like in pfi_kif_alloc().  Or "wait" like
> in other functions.
> 
> Could you cleanup that also?

I renamed the intr parameter for pfr_create_ktable and pfr_attach_table
and passed it further up the call stack.
In pf_addr_setup I noticed that pf_tbladdr_setup also used to set intr.
I changed it to wait for both, pfi_dynaddr_setup and pf_tbladdr_setup.

Now it should be a little easier to notice sleeping points in pf_ioctl.c.

OK?

mbuhl

Index: sys/net/pf.c
===
RCS file: /cvs/src/sys/net/pf.c,v
retrieving revision 1.1133
diff -u -p -r1.1133 pf.c
--- sys/net/pf.c13 Jun 2022 12:48:00 -  1.1133
+++ sys/net/pf.c26 Jun 2022 09:21:21 -
@@ -1585,11 +1585,11 @@ pf_purge_expired_states(u_int32_t maxche
 }
 
 int
-pf_tbladdr_setup(struct pf_ruleset *rs, struct pf_addr_wrap *aw)
+pf_tbladdr_setup(struct pf_ruleset *rs, struct pf_addr_wrap *aw, int wait)
 {
if (aw->type != PF_ADDR_TABLE)
return (0);
-   if ((aw->p.tbl = pfr_attach_table(rs, aw->v.tblname, 1)) == NULL)
+   if ((aw->p.tbl = pfr_attach_table(rs, aw->v.tblname, wait)) == NULL)
return (1);
return (0);
 }
Index: sys/net/pf_if.c
===
RCS file: /cvs/src/sys/net/pf_if.c,v
retrieving revision 1.105
diff -u -p -r1.105 pf_if.c
--- sys/net/pf_if.c 16 May 2022 13:31:19 -  1.105
+++ sys/net/pf_if.c 26 Jun 2022 09:38:43 -
@@ -423,7 +423,7 @@ pfi_match_addr(struct pfi_dynaddr *dyn, 
 }
 
 int
-pfi_dynaddr_setup(struct pf_addr_wrap *aw, sa_family_t af)
+pfi_dynaddr_setup(struct pf_addr_wrap *aw, sa_family_t af, int wait)
 {
struct pfi_dynaddr  *dyn;
char tblname[PF_TABLE_NAME_SIZE];
@@ -432,8 +432,7 @@ pfi_dynaddr_setup(struct pf_addr_wrap *a
 
if (aw->type != PF_ADDR_DYNIFTL)
return (0);
-   if ((dyn = pool_get(_addr_pl, PR_WAITOK | PR_LIMITFAIL | PR_ZERO))
-   == NULL)
+   if ((dyn = pool_get(_addr_pl, wait|PR_LIMITFAIL|PR_ZERO)) == NULL)
return (1);
 
if (!strcmp(aw->v.ifname, "self"))
@@ -466,7 +465,7 @@ pfi_dynaddr_setup(struct pf_addr_wrap *a
goto _bad;
}
 
-   if ((dyn->pfid_kt = pfr_attach_table(ruleset, tblname, 1)) == NULL) {
+   if ((dyn->pfid_kt = pfr_attach_table(ruleset, tblname, wait)) == NULL) {
rv = 1;
goto _bad;
}
Index: sys/net/pf_ioctl.c
===
RCS file: /cvs/src/sys/net/pf_ioctl.c,v
retrieving revision 1.381
diff -u -p -r1.381 pf_ioctl.c
--- sys/net/pf_ioctl.c  10 May 2022 23:12:25 -  1.381
+++ sys/net/pf_ioctl.c  26 Jun 2022 09:35:17 -
@@ -891,8 +891,8 @@ int
 pf_addr_setup(struct pf_ruleset *ruleset, struct pf_addr_wrap *addr,
 sa_family_t af)
 {
-   if (pfi_dynaddr_setup(addr, af) ||
-   pf_tbladdr_setup(ruleset, addr) ||
+   if (pfi_dynaddr_setup(addr, af, PR_WAITOK) ||
+   pf_tbladdr_setup(ruleset, addr, PR_WAITOK) ||
pf_rtlabel_add(addr))
return (EINVAL);
 
@@ -1413,7 +1413,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
 
if (rule->overload_tblname[0]) {
if ((rule->overload_tbl = pfr_attach_table(ruleset,
-   rule->overload_tblname, 0)) == NULL)
+   rule->overload_tblname, PR_WAITOK)) == NULL)
error = EINVAL;
else
rule->overload_tbl->pfrkt_flags |= 
PFR_TFLAG_ACTIVE;
@@ -1636,7 +1636,8 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
 
if (newrule->overload_tblname[0]) {
newrule->overload_tbl = pfr_attach_table(
-   ruleset, newrule->overload_tblname, 0);
+   ruleset, newrule->overload_tblname,
+   PR_WAITOK);
if (newrule->overload_tbl == NULL)
error = EINVAL;
else
Index: sys/net/pf_table.c
===
RCS file: /cvs/src/sys/net/pf_table.c,v
retrieving revision 1.142
diff -u -p -r1.142 pf_table.c
--- sys/net/pf_table.c  16 Jun 2022 20:47:26 -  1.142
+++ sys/net/pf_table.c  26 Jun 2022 

igc vlan hwtagging

2022-06-02 Thread Moritz Buhl
Dear tech@,

the following diff should implement vlan hwtagging for igc.
I would appreciate feedback from further testing.

OK?
mbuhl

Index: sys/dev/pci/if_igc.c
===
RCS file: /cvs/src/sys/dev/pci/if_igc.c,v
retrieving revision 1.9
diff -u -p -r1.9 if_igc.c
--- sys/dev/pci/if_igc.c2 Jun 2022 07:41:17 -   1.9
+++ sys/dev/pci/if_igc.c2 Jun 2022 12:18:14 -
@@ -790,11 +790,9 @@ igc_setup_interface(struct igc_softc *sc
 
ifp->if_capabilities = IFCAP_VLAN_MTU;
 
-#ifdef notyet
 #if NVLAN > 0
ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING;
 #endif
-#endif
 
ifp->if_capabilities |= IFCAP_CSUM_IPv4;
ifp->if_capabilities |= IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4;
@@ -1007,17 +1005,19 @@ igc_start(struct ifqueue *ifq)
prod &= mask;
}
 
+   cmd_type_len = IGC_ADVTXD_DCMD_IFCS | IGC_ADVTXD_DTYP_DATA |
+   IGC_ADVTXD_DCMD_DEXT;
+
+   if (ISSET(m->m_flags, M_VLANTAG))
+   cmd_type_len |= IGC_ADVTXD_DCMD_VLE;
+
for (i = 0; i < map->dm_nsegs; i++) {
txdesc = >tx_base[prod];
 
-   cmd_type_len = IGC_ADVTXD_DCMD_IFCS | 
IGC_ADVTXD_DTYP_DATA |
-   IGC_ADVTXD_DCMD_DEXT | map->dm_segs[i].ds_len;
-   if (i == map->dm_nsegs - 1)
-   cmd_type_len |= IGC_ADVTXD_DCMD_EOP |
-   IGC_ADVTXD_DCMD_RS;
-
htolem64(>read.buffer_addr, 
map->dm_segs[i].ds_addr);
-   htolem32(>read.cmd_type_len, cmd_type_len);
+   htolem32(>read.cmd_type_len, cmd_type_len |
+   map->dm_segs[i].ds_len | ((i == map->dm_nsegs - 1)?
+   IGC_ADVTXD_DCMD_EOP | IGC_ADVTXD_DCMD_RS : 0));
htolem32(>read.olinfo_status, olinfo_status);
 
last = prod;
@@ -2006,10 +2006,10 @@ igc_tx_ctx_setup(struct tx_ring *txr, st
struct mbuf *m;
uint32_t type_tucmd_mlhl = 0;
uint32_t vlan_macip_lens = 0;
-   uint32_t iphlen;
+   uint32_t iphlen = 0;
int hoff;
int off = 0;
-   uint8_t ipproto;
+   uint8_t ipproto = 0;
 
vlan_macip_lens |= (sizeof(*eh) << IGC_ADVTXD_MACLEN_SHIFT);
 
@@ -2018,7 +2018,6 @@ igc_tx_ctx_setup(struct tx_ring *txr, st
 * be placed into the context descriptor. Hence
 * we need to make one even if not doing offloads.
 */
-#ifdef notyet
 #if NVLAN > 0
if (ISSET(mp->m_flags, M_VLANTAG)) {
uint32_t vtag = mp->m_pkthdr.ether_vtag;
@@ -2026,7 +2025,6 @@ igc_tx_ctx_setup(struct tx_ring *txr, st
off = 1;
}
 #endif
-#endif
 
switch (ntohs(eh->ether_type)) {
case ETHERTYPE_IP: {
@@ -2062,8 +2060,6 @@ igc_tx_ctx_setup(struct tx_ring *txr, st
break;
}
 #endif
-   default:
-   return 0;
}
 
vlan_macip_lens |= iphlen;



Re: igc checksum offloading

2022-05-25 Thread Moritz Buhl
On Wed, May 25, 2022 at 04:18:00PM +0200, Moritz Buhl wrote:
> Dear tech@,
> 
> The attached diff should handle checksum offloading for igc(4)
> for IPv4, TCP, and UDP.
> 
> I tested with "Intel I225-LM" rev 0x03,
> kevlo@ tested this with "Intel I225-V" rev 0x03
> 
> OK?

The diff doesn't look right. Try again.


Index: sys/dev/pci/if_igc.c
===
RCS file: /cvs/src/sys/dev/pci/if_igc.c,v
retrieving revision 1.8
diff -u -p -r1.8 if_igc.c
--- sys/dev/pci/if_igc.c11 May 2022 06:14:15 -  1.8
+++ sys/dev/pci/if_igc.c25 May 2022 18:44:39 -
@@ -48,6 +48,8 @@
 
 #include 
 #include 
+#include 
+#include 
 
 #if NBPFILTER > 0
 #include 
@@ -115,6 +117,7 @@ int igc_media_change(struct ifnet *);
 void   igc_iff(struct igc_softc *);
 void   igc_update_link_status(struct igc_softc *);
 intigc_get_buf(struct rx_ring *, int);
+intigc_tx_ctx_setup(struct tx_ring *, struct mbuf *, int, uint32_t *);
 
 void   igc_configure_queues(struct igc_softc *);
 void   igc_set_queues(struct igc_softc *, uint32_t, uint32_t, int);
@@ -791,9 +794,11 @@ igc_setup_interface(struct igc_softc *sc
 #if NVLAN > 0
ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING;
 #endif
+#endif
 
+   ifp->if_capabilities |= IFCAP_CSUM_IPv4;
ifp->if_capabilities |= IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4;
-#endif
+   ifp->if_capabilities |= IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
 
/* Initialize ifmedia structures. */
ifmedia_init(>media, IFM_IMASK, igc_media_change, igc_media_status);
@@ -996,6 +1001,12 @@ igc_start(struct ifqueue *ifq)
bus_dmamap_sync(txr->txdma.dma_tag, map, 0,
map->dm_mapsize, BUS_DMASYNC_PREWRITE);
 
+   if (igc_tx_ctx_setup(txr, m, prod, _status)) {
+   /* Consume the first descriptor */
+   prod++;
+   prod &= mask;
+   }
+
for (i = 0; i < map->dm_nsegs; i++) {
txdesc = >tx_base[prod];
 
@@ -1977,6 +1988,117 @@ igc_free_transmit_buffers(struct tx_ring
sc->num_tx_desc * sizeof(struct igc_tx_buf));
txr->tx_buffers = NULL;
txr->txtag = NULL;
+}
+
+
+/*
+ *
+ *  Advanced Context Descriptor setup for VLAN, CSUM or TSO
+ *
+ **/
+
+int
+igc_tx_ctx_setup(struct tx_ring *txr, struct mbuf *mp, int prod,
+uint32_t *olinfo_status)
+{
+   struct igc_adv_tx_context_desc *txdesc;
+   struct ether_header *eh = mtod(mp, struct ether_header *);
+   struct mbuf *m;
+   uint32_t type_tucmd_mlhl = 0;
+   uint32_t vlan_macip_lens = 0;
+   uint32_t iphlen;
+   int hoff;
+   int off = 0;
+   uint8_t ipproto;
+
+   vlan_macip_lens |= (sizeof(*eh) << IGC_ADVTXD_MACLEN_SHIFT);
+
+   /*
+* In advanced descriptors the vlan tag must
+* be placed into the context descriptor. Hence
+* we need to make one even if not doing offloads.
+*/
+#ifdef notyet
+#if NVLAN > 0
+   if (ISSET(mp->m_flags, M_VLANTAG)) {
+   uint32_t vtag = mp->m_pkthdr.ether_vtag;
+   vlan_macip_lens |= (vtag << IGC_ADVTXD_VLAN_SHIFT);
+   off = 1;
+   }
+#endif
+#endif
+
+   switch (ntohs(eh->ether_type)) {
+   case ETHERTYPE_IP: {
+   struct ip *ip;
+
+   m = m_getptr(mp, sizeof(*eh), );
+   KASSERT(m != NULL && m->m_len - hoff >= sizeof(*ip));
+   ip = (struct ip *)(mtod(m, caddr_t) + hoff);
+
+   iphlen = ip->ip_hl << 2;
+   ipproto = ip->ip_p;
+
+   type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_IPV4;
+   if (ISSET(mp->m_pkthdr.csum_flags, M_IPV4_CSUM_OUT)) {
+   *olinfo_status |= IGC_TXD_POPTS_IXSM << 8;
+   off = 1;
+   }
+
+   break;
+   }
+#ifdef INET6
+   case ETHERTYPE_IPV6: {
+   struct ip6_hdr *ip6;
+
+   m = m_getptr(mp, sizeof(*eh), );
+   KASSERT(m != NULL && m->m_len - hoff >= sizeof(*ip6));
+   ip6 = (struct ip6_hdr *)(mtod(m, caddr_t) + hoff);
+
+   iphlen = sizeof(*ip6);
+   ipproto = ip6->ip6_nxt;
+
+   type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_IPV6;
+   break;
+   }
+#endif
+   default:
+   return 0;
+   }
+
+   vlan_macip_lens |= iphlen;
+   type_tucmd_mlhl |= IGC_ADVTXD_DCMD_DEXT | IGC_ADVTXD_DTYP_CTXT;
+
+   switch (ipproto) {
+   case IPPROTO_TCP:
+   type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_L4T_T

igc checksum offloading

2022-05-25 Thread Moritz Buhl
Dear tech@,

The attached diff should handle checksum offloading for igc(4)
for IPv4, TCP, and UDP.

I tested with "Intel I225-LM" rev 0x03,
kevlo@ tested this with "Intel I225-V" rev 0x03

OK?

Index: /sys/dev/pci/if_igc.c
===
RCS file: /mount/openbsd/cvs/src/sys/dev/pci/if_igc.c,v
retrieving revision 1.8
diff -u -p -r1.8 if_igc.c
--- /sys/dev/pci/if_igc.c   11 May 2022 06:14:15 -  1.8
+++ /sys/dev/pci/if_igc.c   20 May 2022 18:21:29 -
@@ -48,6 +48,8 @@
 
 #include 
 #include 
+#include 
+#include 
 
 #if NBPFILTER > 0
 #include 
@@ -115,6 +117,7 @@ int igc_media_change(struct ifnet *);
 void   igc_iff(struct igc_softc *);
 void   igc_update_link_status(struct igc_softc *);
 intigc_get_buf(struct rx_ring *, int);
+intigc_tx_ctx_setup(struct tx_ring *, struct mbuf *, int, uint32_t *);
 
 void   igc_configure_queues(struct igc_softc *);
 void   igc_set_queues(struct igc_softc *, uint32_t, uint32_t, int);
@@ -791,9 +794,11 @@ igc_setup_interface(struct igc_softc *sc
 #if NVLAN > 0
ifp->if_capabilities |= IFCAP_VLAN_HWTAGGING;
 #endif
+#endif
 
+   ifp->if_capabilities |= IFCAP_CSUM_IPv4;
ifp->if_capabilities |= IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4;
-#endif
+   ifp->if_capabilities |= IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
 
/* Initialize ifmedia structures. */
ifmedia_init(>media, IFM_IMASK, igc_media_change, igc_media_status);
@@ -996,6 +1001,12 @@ igc_start(struct ifqueue *ifq)
bus_dmamap_sync(txr->txdma.dma_tag, map, 0,
map->dm_mapsize, BUS_DMASYNC_PREWRITE);
 
+   if (igc_tx_ctx_setup(txr, m, prod, _status)) {
+   /* Consume the first descriptor */
+   prod++;
+   prod &= mask;
+   }
+
for (i = 0; i < map->dm_nsegs; i++) {
txdesc = >tx_base[prod];
 
@@ -1977,6 +1988,117 @@ igc_free_transmit_buffers(struct tx_ring
sc->num_tx_desc * sizeof(struct igc_tx_buf));
txr->tx_buffers = NULL;
txr->txtag = NULL;
+}
+
+
+/*
+ *
+ *  Advanced Context Descriptor setup for VLAN, CSUM or TSO
+ *
+ **/
+
+int
+igc_tx_ctx_setup(struct tx_ring *txr, struct mbuf *mp, int prod,
+uint32_t *olinfo_status)
+{
+   struct igc_adv_tx_context_desc *txdesc;
+   struct ether_header *eh = mtod(mp, struct ether_header *);
+   struct mbuf *m;
+   uint32_t type_tucmd_mlhl = 0;
+   uint32_t vlan_macip_lens = 0;
+   uint32_t iphlen;
+   int hoff;
+   int off = 0;
+   uint8_t ipproto;
+
+   vlan_macip_lens |= (sizeof(*eh) << IGC_ADVTXD_MACLEN_SHIFT);
+
+   /*
+* In advanced descriptors the vlan tag must
+* be placed into the context descriptor. Hence
+* we need to make one even if not doing offloads.
+*/
+#ifdef notyet
+#if NVLAN > 0
+   if (ISSET(mp->m_flags, M_VLANTAG)) {
+   uint32_t vtag = mp->m_pkthdr.ether_vtag;
+   vlan_macip_lens |= (vtag << IGC_ADVTXD_VLAN_SHIFT);
+   off = 1;
+   }
+#endif
+#endif
+
+   switch (ntohs(eh->ether_type)) {
+   case ETHERTYPE_IP: {
+   struct ip *ip;
+
+   m = m_getptr(mp, sizeof(*eh), );
+   KASSERT(m != NULL && m->m_len - hoff >= sizeof(*ip));
+   ip = (struct ip *)(mtod(m, caddr_t) + hoff);
+
+   iphlen = ip->ip_hl << 2;
+   ipproto = ip->ip_p;
+
+   type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_IPV4;
+   if (ISSET(mp->m_pkthdr.csum_flags, M_IPV4_CSUM_OUT)) {
+   *olinfo_status |= IGC_TXD_POPTS_IXSM << 8;
+   off = 1;
+   }
+
+   break;
+   }
+#ifdef INET6
+   case ETHERTYPE_IPV6: {
+   struct ip6_hdr *ip6;
+
+   m = m_getptr(mp, sizeof(*eh), );
+   KASSERT(m != NULL && m->m_len - hoff >= sizeof(*ip6));
+   ip6 = (struct ip6_hdr *)(mtod(m, caddr_t) + hoff);
+
+   iphlen = sizeof(*ip6);
+   ipproto = ip6->ip6_nxt;
+
+   type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_IPV6;
+   break;
+   }
+#endif
+   default:
+   return 0;
+   }
+
+   vlan_macip_lens |= iphlen;
+   type_tucmd_mlhl |= IGC_ADVTXD_DCMD_DEXT | IGC_ADVTXD_DTYP_CTXT;
+
+   switch (ipproto) {
+   case IPPROTO_TCP:
+   type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_L4T_TCP;
+   if (ISSET(mp->m_pkthdr.csum_flags, M_TCP_CSUM_OUT)) {
+   *olinfo_status |= IGC_TXD_POPTS_TXSM << 8;
+   off = 1;
+   }
+   break;
+   case IPPROTO_UDP:
+   type_tucmd_mlhl |= IGC_ADVTXD_TUCMD_L4T_UDP;
+   

another syzkaller problem in pf

2022-05-03 Thread Moritz Buhl
Hi tech@,

Syzkaller found a few crashes in pf_anchor_global_RB_REMOVE.
https://syzkaller.appspot.com/bug?id=a97f712331903ce38b8c084a489818b9bb5c6fcb
and also
https://syzkaller.appspot.com/text?tag=CrashLog=15ace9aaf0

The call stack is something like this:
pf_anchor_global_RB_REMOVE
pf_remove_if_empty_ruleset
pfi_dynaddr_setup
pfioctl

I believe this is caused by two calls to pf_remove_if_empty_ruleset
in pfi_dynaddr_setup and a possibly pfr_destroy_ktable.

I think it is a problem that pfr_attach_table is being called
with 1 (intr) because the introducing commit says:

commit 4b3977248902c22d96aaebdb5784840debc2631c
Author: mikeb 
Date:   Mon Nov 24 13:22:09 2008 +

Fix splasserts seen in pr 5987 by propagating a flag that discribes
whether we're called from the interrupt context to the functions
performing allocations.

Looked at by mpf@ and henning@, tested by mpf@ and Antti Harri,
the pr originator.

ok tedu

And we are not in interrupt context. A detailed analysis follows.


The pfioctl cd60441a is DIOCADDRULE.

The call stack fails in pf_addr_setup calls from pf_ioctl.c:
1647 if (pf_addr_setup(ruleset, >src.addr,
1648 newrule->af))
similar calls below.


One of these calls might do the following:
424 pfi_dynaddr_setup(struct pf_addr_wrap *aw, sa_family_t af)
...
462 if ((ruleset = pf_find_or_create_ruleset(PF_RESERVED_ANCHOR)) == 
NULL) {
assume false.
...
467 if ((dyn->pfid_kt = pfr_attach_table(ruleset, tblname, 1)) == NULL)
might evaluate to true, I will share my thoughts on it.
468 rv = 1;
469 goto _bad;
...
481 _bad:
482 if (dyn->pfid_kt != NULL)
483 pfr_detach_table(dyn->pfid_kt);
484 if (ruleset != NULL)
485 pf_remove_if_empty_ruleset(ruleset);
remember this call. I will show a case where we don't want this.

due to the following in pf_table.c:
2424 pfr_attach_table(struct pf_ruleset *rs, char *name, int intr)
...
2434 kt = pfr_lookup_table();
2435 if (kt == NULL) {
assume true.
2436 kt = pfr_create_ktable(, gettime(), 1, intr);
2437 if (kt == NULL)
2438 return (NULL);
2439 if (ac != NULL) {
assume true.
this means kt != NULL and kt->pfrkt_rs = our ruleset.
2440 bzero(tbl.pfrt_anchor, sizeof(tbl.pfrt_anchor));
2441 rt = pfr_lookup_table();
2442 if (rt == NULL) {
assume true.
pfr_ktable_compares pfrkt_name an pfrkt_anchor if previous was same.
pfrkt_name should be '\0', pfrkt_anchor is PF_RESERVED_ANCHOR vs '\0'
2443 rt = pfr_create_ktable(, 0, 1, intr);
let's look into this call next just to be sure.
2444 if (rt == NULL) {
assume true.
2445 pfr_destroy_ktable(kt, 0);
remember this call is possible. We look at this afterwards.
2446 return (NULL);
...

looking into create_ktable, I want to see if we can reach pfr_destroy_ktable,
still in pf_table.c:
2202 pfr_create_ktable(struct pfr_table *tbl, time_t tzero, int attachruleset
...
2208 if (intr)
2209 kt = pool_get(_ktable_pl, 
PR_NOWAIT|PR_ZERO|PR_LIMITFAIL);
This can fail. Returning NULL afterwards.
Note PR_ZERO.
...
2216 if (attachruleset) {
2217 rs = pf_find_or_create_ruleset(tbl->pfrt_anchor);
2218 if (!rs) {
not reachable because pfrt_anchor comes from l.2433.
But a weird indirection for our current call stack anyway. Might be necessary.
2219 pfr_destroy_ktable(kt, 0);
2220 return (NULL);
2221 }
 kt->pfrkt_rs = rs;
2223 rs->tables++;
2224 }
2225
2226 if (!rn_inithead((void **)>pfrkt_ip4,
2227 offsetof(struct sockaddr_in, sin_addr)) ||
2228 !rn_inithead((void **)>pfrkt_ip6,
2229 offsetof(struct sockaddr_in6, sin6_addr))) {
unreachable because >pfrkt_ip{4,6} are NULL.
2230 pfr_destroy_ktable(kt, 0);
2231 return (NULL);
2232 }

Back to pfr_destroy_ktable, still in pf_table.c:
2253 pfr_destroy_ktable(struct pfr_ktable *kt, int flushaddr)
2254 {
...
2268 if (kt->pfrkt_rs != NULL) {
assume true.
2269 kt->pfrkt_rs->tables--;
2270 pf_remove_if_empty_ruleset(kt->pfrkt_rs);
our first call to pf_remove_if_empty_ruleset.
The second one follows from pfi_dynaddr_setup.

If this is sound, then the only reason why pfr_destroy_ktable was called
is that pool_get is called with PR_NOWAIT.  And then the following diff
would help.

mbuhl


Index: pf_if.c
===
RCS file: /cvs/src/sys/net/pf_if.c,v
retrieving revision 1.104
diff -u -p 

another copyout while holding net/pf lock in pfioctl

2022-04-28 Thread Moritz Buhl
Hi tech@,

syzkaller reported pfi_get_interfaces
https://syzkaller.appspot.com/bug?id=963ef41930bb84af584be3bf910a3f67d65581a0
which does a copyout while holding NET and PF lock.

The following diff should address this.

mbuhl


Index: sys/net/pf_if.c
===
RCS file: /cvs/src/sys/net/pf_if.c,v
retrieving revision 1.103
diff -u -p -r1.103 pf_if.c
--- sys/net/pf_if.c 26 Dec 2021 01:00:32 -  1.103
+++ sys/net/pf_if.c 28 Apr 2022 16:36:49 -
@@ -755,7 +755,7 @@ pfi_update_status(const char *name, stru
}
 }
 
-int
+void
 pfi_get_ifaces(const char *name, struct pfi_kif *buf, int *size)
 {
struct pfi_kif  *p, *nextp;
@@ -769,16 +769,12 @@ pfi_get_ifaces(const char *name, struct 
if (!p->pfik_tzero)
p->pfik_tzero = gettime();
pfi_kif_ref(p, PFI_KIF_REF_RULE);
-   if (copyout(p, buf++, sizeof(*buf))) {
-   pfi_kif_unref(p, PFI_KIF_REF_RULE);
-   return (EFAULT);
-   }
+   memcpy(buf++, p, sizeof(*buf));
nextp = RB_NEXT(pfi_ifhead, _ifs, p);
pfi_kif_unref(p, PFI_KIF_REF_RULE);
}
}
*size = n;
-   return (0);
 }
 
 int
Index: sys/net/pf_ioctl.c
===
RCS file: /cvs/src/sys/net/pf_ioctl.c,v
retrieving revision 1.379
diff -u -p -r1.379 pf_ioctl.c
--- sys/net/pf_ioctl.c  9 Apr 2022 13:15:44 -   1.379
+++ sys/net/pf_ioctl.c  28 Apr 2022 17:21:47 -
@@ -2921,18 +2921,30 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
break;
 
case DIOCIGETIFACES: {
-   struct pfioc_iface *io = (struct pfioc_iface *)addr;
+   struct pfioc_iface  *io = (struct pfioc_iface *)addr;
+   struct pfi_kif  *kif_buf;
+   int  apfiio_size = io->pfiio_size;
 
if (io->pfiio_esize != sizeof(struct pfi_kif)) {
error = ENODEV;
goto fail;
}
+
+   if ((kif_buf = mallocarray(sizeof(*kif_buf), apfiio_size,
+   M_TEMP, M_WAITOK|M_CANFAIL)) == NULL) {
+   error = EINVAL;
+   goto fail;
+   }
+
NET_LOCK();
PF_LOCK();
-   error = pfi_get_ifaces(io->pfiio_name, io->pfiio_buffer,
-   >pfiio_size);
+   pfi_get_ifaces(io->pfiio_name, kif_buf, >pfiio_size);
PF_UNLOCK();
NET_UNLOCK();
+   if (copyout(kif_buf, io->pfiio_buffer, sizeof(*kif_buf) *
+   io->pfiio_size))
+   error = EFAULT;
+   free(kif_buf, M_TEMP, sizeof(*kif_buf) * apfiio_size);
break;
}
 
Index: sys/net/pfvar.h
===
RCS file: /cvs/src/sys/net/pfvar.h,v
retrieving revision 1.506
diff -u -p -r1.506 pfvar.h
--- sys/net/pfvar.h 21 Apr 2022 15:22:49 -  1.506
+++ sys/net/pfvar.h 28 Apr 2022 16:22:44 -
@@ -1879,7 +1879,7 @@ intpfi_dynaddr_setup(struct pf_addr_w
 voidpfi_dynaddr_remove(struct pf_addr_wrap *);
 voidpfi_dynaddr_copyout(struct pf_addr_wrap *);
 voidpfi_update_status(const char *, struct pf_status *);
-int pfi_get_ifaces(const char *, struct pfi_kif *, int *);
+voidpfi_get_ifaces(const char *, struct pfi_kif *, int *);
 int pfi_set_flags(const char *, int);
 int pfi_clear_flags(const char *, int);
 voidpfi_xcommit(void);



Re: wall(1) - fix issues with line wrapping

2022-04-25 Thread Moritz Buhl
On Tue, Apr 19, 2022 at 09:00:08PM +0200, Anton Borowka wrote:
> Hi,
> 
> the attached diff fixes 2 issues with line wrappings in wall(1).
...

I can confirm that your patch fixes the described problems. But I
am wondering if it makes sense to split the ch == '\n' case and the
cnt == 79 case into two blocks.

OK mbuhl.



add sendmmsg and recvmmsg systemcalls

2022-04-22 Thread Moritz Buhl
l to
-.Fn recvfrom
-with a null
-.Fa from
-parameter.
-.Pp
-On successful completion, all three routines return the number of
-message bytes read.
-If a message is too long to fit in the supplied
-buffer, excess bytes may be discarded depending on the type of socket
-the message is received from (see
-.Xr socket 2 ) .
-.Pp
 If no messages are available at the socket, the
 receive call waits for a message to arrive, unless
 the socket is nonblocking (see
@@ -158,6 +159,8 @@ The
 .Dv MSG_CMSG_CLOEXEC
 requests that any file descriptors received as ancillary data with
 .Fn recvmsg
+and
+.Fn recvmmsg
 (see below)
 have their close-on-exec flag set.
 .Pp
@@ -249,13 +252,67 @@ Indicates that the packet was received a
 .It Dv MSG_MCAST
 Indicates that the packet was received as multicast.
 .El
+.Pp
+The
+.Fn recvmmsg
+call uses an array of the
+.Fa mmsghdr
+structure of length
+.Fa vlen
+to group multiple
+.Fa msghdr
+structures into a single system call.
+.Fa vlen
+is capped at maximum
+.Dv 1024
+messages that are received in a single call.
+The
+.Fa flags
+field allows setting
+.Dv MSG_WAITFORONE
+to wait for one
+.Fa msghdr ,
+and set
+.Dv MSG_DONTWAIT
+for all subsequent messages.
+A provided
+.Fa timeout
+limits the time spent in the function but it does not limit the
+time spent in lower parts of the kernel.
+.Pp
+The
+.Fa mmsghdr
+structure has the following form, as defined in
+.In sys/socket.h :
+.Bd -literal
+struct mmsghdr {
+   struct msghdr msg_hdr;
+   unsigned int msg_len;
+};
+.Ed
+.Pp
+Here
+.Fa msg_len
+indicated the number of bytes received for each
+.Fa msg_hdr
+member.
 .Sh RETURN VALUES
-These calls return the number of bytes received, or \-1 if an error occurred.
+The
+.Fn send ,
+.Fn sendto ,
+and
+.Fn sendmsg
+calls return the number of bytes received, or \-1 if an error occurred.
+The
+.Fn sendmmsg
+call returns the number of messages received, or \-1
+if an error occurred before the first message has been received. 
 .Sh ERRORS
 .Fn recv ,
 .Fn recvfrom ,
+.Fn recvmsg ,
 and
-.Fn recvmsg
+.Fn recvmmsg
 fail if:
 .Bl -tag -width "[EHOSTUNREACH]"
 .It Bq Er EBADF
@@ -310,6 +367,8 @@ was larger than
 .Pp
 And
 .Fn recvmsg
+and
+.Fn recvmmsg
 may return one of the following errors:
 .Bl -tag -width Er
 .It Bq Er EINVAL
@@ -364,6 +423,12 @@ The
 .Fn recv
 function call appeared in
 .Bx 4.1c .
+The
+.Fn sendmmsg
+syscall first appeared in Linux 2.6.33, was reimplemented for
+.Nx 7.0 ,
+and ported to
+.Ox 7.1 .
 .Sh CAVEATS
 Calling
 .Fn recvmsg
Index: lib/libc/sys/send.2
===
RCS file: /cvs/src/lib/libc/sys/send.2,v
retrieving revision 1.34
diff -u -p -r1.34 send.2
--- lib/libc/sys/send.2 11 Jan 2019 06:10:13 -  1.34
+++ lib/libc/sys/send.2 22 Apr 2022 16:03:30 -
@@ -36,7 +36,8 @@
 .Sh NAME
 .Nm send ,
 .Nm sendto ,
-.Nm sendmsg
+.Nm sendmsg ,
+.Nm sendmmsg
 .Nd send a message from a socket
 .Sh SYNOPSIS
 .In sys/socket.h
@@ -46,19 +47,23 @@
 .Fn sendto "int s" "const void *msg" "size_t len" "int flags" "const struct 
sockaddr *to" "socklen_t tolen"
 .Ft ssize_t
 .Fn sendmsg "int s" "const struct msghdr *msg" "int flags"
+.Ft int
+.Fn sendmmsg "int s" "const struct mmsghdr *mmsg" "unsigned int vlen" 
"unsigned int flags"
 .Sh DESCRIPTION
 .Fn send ,
 .Fn sendto ,
+.Fn sendmsg ,
 and
-.Fn sendmsg
+.Fn sendmmsg
 are used to transmit a message to another socket.
 .Fn send
 may be used only when the socket is in a
 .Em connected
 state, while
-.Fn sendto
+.Fn sendto ,
+.Fn sendmsg ,
 and
-.Fn sendmsg
+.Fn sendmmsg
 may be used at any time.
 .Pp
 The address of the target is given by
@@ -127,10 +132,21 @@ See
 .Xr recv 2
 for a description of the
 .Fa msghdr
-structure.
+and
+.Fa mmsghdr
+structures.
 .Sh RETURN VALUES
-The call returns the number of characters sent, or \-1
+The
+.Fn send ,
+.Fn sendto ,
+and
+.Fn sendmsg
+calls return the number of characters sent, or \-1
 if an error occurred.
+The
+.Fn sendmmsg
+call returns the number of messages sent, or \-1
+if an error occurred before the first message has been sent. 
 .Sh ERRORS
 .Fn send ,
 .Fn sendto ,
@@ -267,3 +283,9 @@ The
 .Fn send
 function call appeared in
 .Bx 4.1c .
+The
+.Fn sendmmsg
+syscall first appeared in Linux 3.0, was reimplemented for
+.Nx 7.0 ,
+and ported to
+.Ox 7.1 .
Index: lib/libc/sys/w_recvmmsg.c
===
RCS file: lib/libc/sys/w_recvmmsg.c
diff -N lib/libc/sys/w_recvmmsg.c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ lib/libc/sys/w_recvmmsg.c   22 Apr 2022 16:03:30 -
@@ -0,0 +1,32 @@
+/* $OpenBSD$ */
+/*
+ * Copyright (c) 2021 Moritz Buhl 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permissio

pf_rollback_rules always returns 0

2022-04-07 Thread Moritz Buhl
Hi,

Since 2015 pf_rollback_rules always returns 0. Make it void.

OK?
mbuhl


Index: sys/net/pf_ioctl.c
===
RCS file: /cvs/src/sys/net/pf_ioctl.c,v
retrieving revision 1.376
diff -u -p -r1.376 pf_ioctl.c
--- sys/net/pf_ioctl.c  4 Apr 2022 12:57:36 -   1.376
+++ sys/net/pf_ioctl.c  7 Apr 2022 13:14:28 -
@@ -93,7 +93,7 @@ intpfopen(dev_t, int, int, struct pr
 int pfclose(dev_t, int, int, struct proc *);
 int pfioctl(dev_t, u_long, caddr_t, int, struct proc *);
 int pf_begin_rules(u_int32_t *, const char *);
-int pf_rollback_rules(u_int32_t, char *);
+voidpf_rollback_rules(u_int32_t, char *);
 voidpf_remove_queues(void);
 int pf_commit_queues(void);
 voidpf_free_queues(struct pf_queuehead *);
@@ -537,7 +537,7 @@ pf_begin_rules(u_int32_t *ticket, const 
return (0);
 }
 
-int
+void
 pf_rollback_rules(u_int32_t ticket, char *anchor)
 {
struct pf_ruleset   *rs;
@@ -546,7 +546,7 @@ pf_rollback_rules(u_int32_t ticket, char
rs = pf_find_ruleset(anchor);
if (rs == NULL || !rs->rules.inactive.open ||
rs->rules.inactive.ticket != ticket)
-   return (0);
+   return;
while ((rule = TAILQ_FIRST(rs->rules.inactive.ptr)) != NULL) {
pf_rm_rule(rs->rules.inactive.ptr, rule);
rs->rules.inactive.rcount--;
@@ -555,11 +555,9 @@ pf_rollback_rules(u_int32_t ticket, char
 
/* queue defs only in the main ruleset */
if (anchor[0])
-   return (0);
+   return;
 
pf_free_queues(pf_queues_inactive);
-
-   return (0);
 }
 
 void
@@ -2597,14 +2595,7 @@ pfioctl(dev_t dev, u_long cmd, caddr_t a
}
break;
case PF_TRANS_RULESET:
-   if ((error = pf_rollback_rules(ioe->ticket,
-   ioe->anchor))) {
-   PF_UNLOCK();
-   NET_UNLOCK();
-   free(table, M_TEMP, sizeof(*table));
-   free(ioe, M_TEMP, sizeof(*ioe));
-   goto fail; /* really bad */
-   }
+   pf_rollback_rules(ioe->ticket, ioe->anchor);
break;
default:
PF_UNLOCK();



Re: possible memory leak in ipmi get_sdr

2022-04-07 Thread Moritz Buhl
Any insights?

On Mon, Jan 10, 2022 at 03:18:47PM +0100, Moritz Buhl wrote:
> Hi tech@,
> 
> The return value of add_child_sensors is returned in add_sdr_sensor,
> which is called by get_sdr.  get_sdr mallocs psdr and only frees it
> if add_sdr_sensor returns 0.  The assumption is that psdr is attached
> to a list in add_child_sensors otherwise.  This is not the case if
> the malloc for psensor fails, or read_sensor() equals 0.  In any
> of these error cases psdr is leaked.
> 
> Kind regards
> mbuhl
> 
> Index: sys/dev/ipmi.c
> ===
> RCS file: /mount/openbsd/cvs/src/sys/dev/ipmi.c,v
> retrieving revision 1.115
> diff -u -p -r1.115 ipmi.c
> --- sys/dev/ipmi.c23 Jan 2021 12:10:08 -  1.115
> +++ sys/dev/ipmi.c10 Jan 2022 13:49:19 -
> @@ -1374,7 +1374,7 @@ add_child_sensors(struct ipmi_softc *sc,
>  int sensor_num, int sensor_type, int ext_type, int sensor_base,
>  int entity, const char *name)
>  {
> - int typ, idx;
> + int typ, idx, rc = 0;
>   struct ipmi_sensor  *psensor;
>  #ifdef IPMI_DEBUG
>   struct sdrtype1 *s1 = (struct sdrtype1 *)psdr;
> @@ -1415,11 +1415,12 @@ add_child_sensors(struct ipmi_softc *sc,
>   dbg_printf(5, "  reading: %lld [%s]\n",
>   psensor->i_sensor.value,
>   psensor->i_sensor.desc);
> + rc = 1;
>   } else
>   free(psensor, M_DEVBUF, sizeof(*psensor));
>   }
>  
> - return (1);
> + return (rc);
>  }
>  
>  /* Handle IPMI Timer - reread sensor values */
> 



Re: possible memory leak in ipmi get_sdr

2022-01-14 Thread Moritz Buhl
I thought I'd walk through my understanding of the code
to make it easier to review.

from sys/dev/ipmi.c:

1019 int
1020 get_sdr(struct ipmi_softc *sc, u_int16_t recid, u_int16_t *nxtrec)
1021 {
...
1046 psdr = malloc(sdrlen, M_DEVBUF, M_NOWAIT);
1047 if (psdr == NULL)
1048 return (1);
Now psdr is allocated. In the following loop it is freed correctly on
failure:
1062 free(psdr, M_DEVBUF, sdrlen);
1063 return (1);
If everything works, we attach it to a list:
1067 /* Add SDR to sensor list, if not wanted, free buffer */
1068 if (add_sdr_sensor(sc, psdr, sdrlen) == 0)
1069 free(psdr, M_DEVBUF, sdrlen);
1070
1071 return (0);
So if add_sdr_sensor doesn't return 0 it has to keep a reference
to psdr somewhere.

1335 int
1336 add_sdr_sensor(struct ipmi_softc *sc, u_int8_t *psdr, int sdrlen)
1337 {
There are two switch cases that both call add_child_sensors and set
rc with it, if no previous failures occour:
1349 rc = add_child_sensors(sc, psdr, 1, s1->sensor_num,
1350 s1->sensor_type, s1->event_code, 0, s1->entity_id, 
name);
1351 break;
and
1358 rc = add_child_sensors(sc, psdr, s2->share1 & 0xF,
1359 s2->sensor_num, s2->sensor_type, s2->event_code,
1360 s2->share2 & 0x7F, s2->entity_id, name);
1361 break;
These two calls are the only locations that can make add_sdr_sensor
return something else than 0.

1370 int
1371 add_child_sensors(struct ipmi_softc *sc, u_int8_t *psdr, int count,
1372 int sensor_num, int sensor_type, int ext_type, int sensor_base,
1373 int entity, const char *name)
1374 {
So as mentioned before, this function needs to return 0 or keep a
reference to  psdr.
This is not the case in two situations, but to keep it simple I
will only show one:
1388 psensor = malloc(sizeof(*psensor), M_DEVBUF, M_NOWAIT | 
M_ZERO);
1389 if (psensor == NULL)
1390 break;
...
1420 return (1);

I hope this helps
mbuhl



uninitialized stack memory possibly passed to m_freem

2022-01-12 Thread Moritz Buhl
Hi tech@,

https://github.com/openbsd/src/commit/0ea6bae06233cd25645df14602c3eda6bdff7dca.patch

the patch forgot to add mrep to the info struct, nfsm_dissect could
pass info.nmi_mrep to m_freem, which is currently uninitialized
stack memory.

Index: sys/nfs/nfs_subs.c
===
RCS file: /mount/openbsd/cvs/src/sys/nfs/nfs_subs.c,v
retrieving revision 1.145
diff -u -p -r1.145 nfs_subs.c
--- sys/nfs/nfs_subs.c  11 Jan 2022 03:13:59 -  1.145
+++ sys/nfs/nfs_subs.c  12 Jan 2022 09:31:52 -
@@ -1841,6 +1841,7 @@ nfsm_srvsattr(struct mbuf **mp, struct v
 
info.nmi_md = *mp;
info.nmi_dpos = *dposp;
+   info.nmi_mrep = mrep;
 
nfsm_dissect(tl, u_int32_t *, NFSX_UNSIGNED);
if (*tl == nfs_true) {



possible memory leak in ipmi get_sdr

2022-01-10 Thread Moritz Buhl
Hi tech@,

The return value of add_child_sensors is returned in add_sdr_sensor,
which is called by get_sdr.  get_sdr mallocs psdr and only frees it
if add_sdr_sensor returns 0.  The assumption is that psdr is attached
to a list in add_child_sensors otherwise.  This is not the case if
the malloc for psensor fails, or read_sensor() equals 0.  In any
of these error cases psdr is leaked.

Kind regards
mbuhl

Index: sys/dev/ipmi.c
===
RCS file: /mount/openbsd/cvs/src/sys/dev/ipmi.c,v
retrieving revision 1.115
diff -u -p -r1.115 ipmi.c
--- sys/dev/ipmi.c  23 Jan 2021 12:10:08 -  1.115
+++ sys/dev/ipmi.c  10 Jan 2022 13:49:19 -
@@ -1374,7 +1374,7 @@ add_child_sensors(struct ipmi_softc *sc,
 int sensor_num, int sensor_type, int ext_type, int sensor_base,
 int entity, const char *name)
 {
-   int typ, idx;
+   int typ, idx, rc = 0;
struct ipmi_sensor  *psensor;
 #ifdef IPMI_DEBUG
struct sdrtype1 *s1 = (struct sdrtype1 *)psdr;
@@ -1415,11 +1415,12 @@ add_child_sensors(struct ipmi_softc *sc,
dbg_printf(5, "  reading: %lld [%s]\n",
psensor->i_sensor.value,
psensor->i_sensor.desc);
+   rc = 1;
} else
free(psensor, M_DEVBUF, sizeof(*psensor));
}
 
-   return (1);
+   return (rc);
 }
 
 /* Handle IPMI Timer - reread sensor values */



bsd.regress.mk REGRESS_SETUP vs REGRESS_SETUP_ONCE

2020-09-24 Thread Moritz Buhl
Dear tech@,

given the following Makefile:

# REGRESS_SETUP vs REGRESS_SETUP_ONCE example

setup:
@echo setup

setup_once:
@echo setup_once

test1:
@echo test1

test2:
@echo test2

cleanup:
@echo cleanup

REGRESS_SETUP = setup
REGRESS_SETUP_ONCE =setup_once
REGRESS_TARGETS+=   test1 test2
REGRESS_CLEANUP =   cleanup

.include 

I expected the following output:
pbp$ make regress
setup_once
setup
test1
setup
test2
cleanup

But instead I got:
pbp$ make regress
setup
setup_once
test1
setup
test2
cleanup

The diff below yields the expected result instead.

Index: share/mk/bsd.regress.mk
===
RCS file: /cvs/src/share/mk/bsd.regress.mk,v
retrieving revision 1.21
diff -u -p -r1.21 bsd.regress.mk
--- share/mk/bsd.regress.mk 17 Jun 2019 17:20:24 -  1.21
+++ share/mk/bsd.regress.mk 24 Sep 2020 19:48:11 -
@@ -67,16 +67,16 @@ REGRESS_SETUP?=
 REGRESS_SETUP_ONCE?=
 REGRESS_CLEANUP?=
 
-.if !empty(REGRESS_SETUP)
-${REGRESS_TARGETS}: ${REGRESS_SETUP}
-.endif
-
 .if !empty(REGRESS_SETUP_ONCE)
 CLEANFILES+=${REGRESS_SETUP_ONCE:S/^/stamp-/}
 ${REGRESS_TARGETS}: ${REGRESS_SETUP_ONCE:S/^/stamp-/}
 ${REGRESS_SETUP_ONCE:S/^/stamp-/}: .SILENT
${MAKE} -C ${.CURDIR} ${@:S/^stamp-//}
date >$@
+.endif
+
+.if !empty(REGRESS_SETUP)
+${REGRESS_TARGETS}: ${REGRESS_SETUP}
 .endif
 
 regress: .SILENT



Re: vfs posix compatibility change

2019-09-08 Thread Moritz Buhl
Any comments?

July 22, 2019 8:23 PM, "Moritz Buhl"  wrote:

> Hi,
> 
> here is another POSIX compatibility change I found while running
> NetBSD system call tests:
> 
> The incompatibilities can be reproduced with the followding comands:
> `mv / foo`
> mv: rename / to foo: Is a directory
> `man 2 rename` mentions EISDIR:
> [EISDIR] to is a directory, but from is not a directory.
> Which is mentioned similarly in POSIX. But it doesn't fit the current case.
> 
> `mkdir /`
> mkdir: /: Is a directory
> EISDIR is not mentioned in `man 2 mkdir`. Also not mentioned in POSIX.
> I believe EBUSY would be good here.
> 
> `rmdir /`
> rmdir: /: Is a directory
> EISDIR is not mentioned in `man 2 rmdir`. Also not mentioned in POSIX.
> 
> unlink("/");
> printf("%d\n", errno);
> 21 // EISDIR
> Also neither mentioned in man nor POSIX.
> 
> Here is some POSIX references:
> https://pubs.opengroup.org/onlinepubs/9699919799
> unlink (2):
> [EBUSY]
> The file named by the path argument cannot be unlinked because it is
> being used by the system or another process and the implementation
> considers this an error.
> 
> rename (2):
> [EBUSY]
> The directory named by old or new is currently in use by the system or
> another process, and the implementation considers this an error.
> 
> rmdir (2):
> [EBUSY]
> The directory to be removed is currently in use by the system or
> some process and the implementation considers this to be an error.
> 
> mkdir (2):
> [EEXIST]
> The named file exists.
> 
> Attached is a patch that will make mkdir return EEXIST, EBUSY for others.
> I believe this corer case is quite essential as it is in namei because of
> this (and because NetBSD and FreeBSD differ a lot in the vfs area) I hope
> for some comments on this.
> 
> The patched code hasn't changed a lot since 44BSD. duh POSIX.
> There already were discussions on EISDIR in the past: (from unlink POSIX)
> 
>> The standard developers reviewed TR 24715-2006 and noted that LSB-conforming 
>> implementations may
>> return [EISDIR] instead of [EPERM] when unlinking a directory. A change to 
>> permit this behavior by
>> changing the requirement for [EPERM] to [EPERM] or [EISDIR] was considered, 
>> but decided against
>> since it would break existing strictly conforming and conforming 
>> applications. Applications written
>> for portability to both POSIX.1-2017 and the LSB should be prepared to 
>> handle either error code.
> 
> duh.
> 
> thanks,
> mbuhl
> 
> Index: sys/kern/vfs_lookup.c
> ===
> RCS file: /cvs/src/sys/kern/vfs_lookup.c,v
> retrieving revision 1.80
> diff -u -p -r1.80 vfs_lookup.c
> --- sys/kern/vfs_lookup.c 18 Jul 2019 18:06:17 - 1.80
> +++ sys/kern/vfs_lookup.c 22 Jul 2019 16:10:14 -
> @@ -440,7 +440,10 @@ vfs_lookup(struct nameidata *ndp)
> */
> if (cnp->cn_nameptr[0] == '\0') {
> if (ndp->ni_dvp == NULL && wantparent) {
> - error = EISDIR;
> + if (cnp->cn_nameiop == CREATE)
> + error = EEXIST;
> + else
> + error = EBUSY;
> goto bad;
> }
> ndp->ni_vp = dp;
> Index: lib/libc/sys/rename.2
> ===
> RCS file: /cvs/src/lib/libc/sys/rename.2,v
> retrieving revision 1.22
> diff -u -p -r1.22 rename.2
> --- lib/libc/sys/rename.2 10 Sep 2015 17:55:21 - 1.22
> +++ lib/libc/sys/rename.2 22 Jul 2019 16:45:16 -
> @@ -123,6 +123,12 @@ characters, or an entire pathname (inclu
> exceeded
> .Dv PATH_MAX
> bytes.
> +.It Bq Er EBUSY
> +The directory containing
> +.Fa from
> +or
> +.Fa to
> +is currently in use by the system.
> .It Bq Er ENOENT
> A component of the
> .Fa from



fix ssh unittest misc regression test

2019-08-22 Thread Moritz Buhl
Hi,

some ssh unittest ist failing because it links stuff together in another way.
basically add 'sshbuf-misc.c' to SRCS.
additional suggestion: make variable assignments similar to ports

mbuhl

Index: regress/usr.bin/ssh/unittests/misc/Makefile
===
RCS file: /mount/openbsd/cvs/src/regress/usr.bin/ssh/unittests/misc/Makefile,v
retrieving revision 1.1
diff -u -p -r1.1 Makefile
--- regress/usr.bin/ssh/unittests/misc/Makefile 28 Apr 2019 22:53:26 -  
1.1
+++ regress/usr.bin/ssh/unittests/misc/Makefile 22 Aug 2019 20:06:46 -
@@ -1,14 +1,15 @@
 #  $OpenBSD: Makefile,v 1.1 2019/04/28 22:53:26 dtucker Exp $
 
-PROG=test_misc
-SRCS=tests.c
+PROG = test_misc
+SRCS = tests.c
 
 # From usr.bin/ssh/Makefile.inc
-SRCS+=sshbuf.c sshbuf-getput-basic.c ssherr.c log.c xmalloc.c misc.c
+SRCS +=sshbuf.c sshbuf-getput-basic.c sshbuf-misc.c ssherr.c log.c\
+   xmalloc.c misc.c
 # From usr/bin/ssh/sshd/Makefile
-SRCS+=atomicio.c cleanup.c fatal.c
+SRCS +=atomicio.c cleanup.c fatal.c
 
-REGRESS_TARGETS=run-regress-${PROG}
+REGRESS_TARGETS =  run-regress-${PROG}
 
 run-regress-${PROG}: ${PROG}
env ${TEST_ENV} ./${PROG}



fix rpki client regression tests

2019-08-22 Thread Moritz Buhl
Hi,

some experts forgot to run the rpki-client regression tests after some
changes.  The tests are relinking parts of the source and this is not great
but after all I still prefer some running tests.  Patch attached.

thanks,
mbuhl

Index: regress/usr.sbin/rpki-client/test-cert.c
===
RCS file: /mount/openbsd/cvs/src/regress/usr.sbin/rpki-client/test-cert.c,v
retrieving revision 1.2
diff -u -p -r1.2 test-cert.c
--- regress/usr.sbin/rpki-client/test-cert.c12 Aug 2019 18:03:17 -  
1.2
+++ regress/usr.sbin/rpki-client/test-cert.c22 Aug 2019 19:38:14 -
@@ -31,6 +31,7 @@
 #include 
 
 #include "extern.h"
+int verbose = 0;
 
 static void
 cert_print(const struct cert *p)
Index: regress/usr.sbin/rpki-client/test-ip.c
===
RCS file: /mount/openbsd/cvs/src/regress/usr.sbin/rpki-client/test-ip.c,v
retrieving revision 1.3
diff -u -p -r1.3 test-ip.c
--- regress/usr.sbin/rpki-client/test-ip.c  12 Aug 2019 18:03:17 -  
1.3
+++ regress/usr.sbin/rpki-client/test-ip.c  22 Aug 2019 19:38:19 -
@@ -30,6 +30,7 @@
 #include 
 
 #include "extern.h"
+int verbose = 0;
 
 static void
 test(const char *res, uint16_t afiv, size_t sz, size_t unused, ...)
Index: regress/usr.sbin/rpki-client/test-mft.c
===
RCS file: /mount/openbsd/cvs/src/regress/usr.sbin/rpki-client/test-mft.c,v
retrieving revision 1.2
diff -u -p -r1.2 test-mft.c
--- regress/usr.sbin/rpki-client/test-mft.c 12 Aug 2019 18:03:17 -  
1.2
+++ regress/usr.sbin/rpki-client/test-mft.c 22 Aug 2019 19:38:22 -
@@ -28,6 +28,7 @@
 #include 
 
 #include "extern.h"
+int verbose = 0;
 
 static void
 mft_print(const struct mft *p)
Index: regress/usr.sbin/rpki-client/test-roa.c
===
RCS file: /mount/openbsd/cvs/src/regress/usr.sbin/rpki-client/test-roa.c,v
retrieving revision 1.2
diff -u -p -r1.2 test-roa.c
--- regress/usr.sbin/rpki-client/test-roa.c 12 Aug 2019 18:03:17 -  
1.2
+++ regress/usr.sbin/rpki-client/test-roa.c 22 Aug 2019 19:38:25 -
@@ -28,6 +28,7 @@
 #include 
 
 #include "extern.h"
+int verbose = 0;
 
 static void
 roa_print(const struct roa *p)
Index: regress/usr.sbin/rpki-client/test-tal.c
===
RCS file: /mount/openbsd/cvs/src/regress/usr.sbin/rpki-client/test-tal.c,v
retrieving revision 1.2
diff -u -p -r1.2 test-tal.c
--- regress/usr.sbin/rpki-client/test-tal.c 12 Aug 2019 18:03:17 -  
1.2
+++ regress/usr.sbin/rpki-client/test-tal.c 22 Aug 2019 19:38:28 -
@@ -28,6 +28,7 @@
 #include 
 
 #include "extern.h"
+int verbose = 0;
 
 static void
 tal_print(const struct tal *p)



unmount after realpath

2019-08-02 Thread Moritz Buhl
Hi,

the same as to unveil(2) also counts for realpath(3).
https://marc.info/?l=openbsd-bugs=156469573812165=2
Similar diff attached.

tests are here:
https://github.com/moritzbuhl/realpath-unmount-regress

thanks,
mbuhl

Index: sys/kern/vfs_syscalls.c
===
RCS file: /cvs/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.327
diff -u -p -r1.327 vfs_syscalls.c
--- sys/kern/vfs_syscalls.c 25 Jul 2019 01:43:21 -  1.327
+++ sys/kern/vfs_syscalls.c 2 Aug 2019 12:45:12 -
@@ -948,10 +948,10 @@ sys___realpath(struct proc *p, void *v, 
VOP_UNLOCK(nd.ni_vp);
vrele(nd.ni_vp);
}
-   if (nd.ni_dvp && nd.ni_dvp != nd.ni_vp){
+   if (nd.ni_dvp && nd.ni_dvp != nd.ni_vp)
VOP_UNLOCK(nd.ni_dvp);
+   if (nd.ni_dvp)
vrele(nd.ni_dvp);
-   }
 
error = copyoutstr(nd.ni_cnd.cn_rpbuf, SCARG(uap, resolved),
MAXPATHLEN, NULL);



vfs posix compatibility change

2019-07-22 Thread Moritz Buhl
Hi,

here is another POSIX compatibility change I found while running
NetBSD system call tests:

The incompatibilities can be reproduced with the followding comands:
`mv / foo`
mv: rename / to foo: Is a directory
`man 2 rename` mentions EISDIR:
[EISDIR]   to is a directory, but from is not a directory.
Which is mentioned similarly in POSIX. But it doesn't fit the current case.

`mkdir /`
mkdir: /: Is a directory
EISDIR is not mentioned in `man 2 mkdir`. Also not mentioned in POSIX.
I believe EBUSY would be good here.

`rmdir /`
rmdir: /: Is a directory
EISDIR is not mentioned in `man 2 rmdir`. Also not mentioned in POSIX.

unlink("/");
printf("%d\n", errno);
21 // EISDIR
Also neither mentioned in man nor POSIX.

Here is some POSIX references:
https://pubs.opengroup.org/onlinepubs/9699919799/
unlink (2):
[EBUSY]
The file named by the path argument cannot be unlinked because it is
being used by the system or another process and the implementation
considers this an error.

rename (2):
[EBUSY]
The directory named by old or new is currently in use by the system or
another process, and the implementation considers this an error.

rmdir (2):
[EBUSY]
The directory to be removed is currently in use by the system or
some process and the implementation considers this to be an error.

mkdir (2):
[EEXIST]
The named file exists.

Attached is a patch that will make mkdir return EEXIST, EBUSY for others.
I believe this corer case is quite essential as it is in namei because of
this (and because NetBSD and FreeBSD differ a lot in the vfs area) I hope
for some comments on this.

The patched code hasn't changed a lot since 44BSD. duh POSIX.
There already were discussions on EISDIR in the past: (from unlink POSIX)

> The standard developers reviewed TR 24715-2006 and noted that LSB-conforming 
> implementations may return [EISDIR] instead of [EPERM] when unlinking a 
> directory. A change to permit this behavior by changing the requirement for 
> [EPERM] to [EPERM] or [EISDIR] was considered, but decided against since it 
> would break existing strictly conforming and conforming applications. 
> Applications written for portability to both POSIX.1-2017 and the LSB should 
> be prepared to handle either error code.

duh.

thanks,
mbuhl

Index: sys/kern/vfs_lookup.c
===
RCS file: /cvs/src/sys/kern/vfs_lookup.c,v
retrieving revision 1.80
diff -u -p -r1.80 vfs_lookup.c
--- sys/kern/vfs_lookup.c   18 Jul 2019 18:06:17 -  1.80
+++ sys/kern/vfs_lookup.c   22 Jul 2019 16:10:14 -
@@ -440,7 +440,10 @@ vfs_lookup(struct nameidata *ndp)
 */
if (cnp->cn_nameptr[0] == '\0') {
if (ndp->ni_dvp == NULL && wantparent) {
-   error = EISDIR;
+   if (cnp->cn_nameiop == CREATE)
+   error = EEXIST;
+   else
+   error = EBUSY;
goto bad;
}
ndp->ni_vp = dp;
Index: lib/libc/sys/rename.2
===
RCS file: /cvs/src/lib/libc/sys/rename.2,v
retrieving revision 1.22
diff -u -p -r1.22 rename.2
--- lib/libc/sys/rename.2   10 Sep 2015 17:55:21 -  1.22
+++ lib/libc/sys/rename.2   22 Jul 2019 16:45:16 -
@@ -123,6 +123,12 @@ characters, or an entire pathname (inclu
 exceeded
 .Dv PATH_MAX
 bytes.
+.It Bq Er EBUSY
+The directory containing
+.Fa from
+or
+.Fa to
+is currently in use by the system.
 .It Bq Er ENOENT
 A component of the
 .Fa from



Re: make msgsnd(2) more posix

2019-07-18 Thread Moritz Buhl
> Moritz, can you create a man page ERRORS diff?

Is the attached diff ok?

mbuhl

Index: lib/libc/sys/msgsnd.2
===
RCS file: /cvs/src/lib/libc/sys/msgsnd.2,v
retrieving revision 1.19
diff -u -p -r1.19 msgsnd.2
--- lib/libc/sys/msgsnd.2   10 Dec 2014 19:19:00 -  1.19
+++ lib/libc/sys/msgsnd.2   18 Jul 2019 11:21:48 -
@@ -127,6 +127,9 @@ will fail if:
 .Fa msqid
 is not a valid message queue identifier.
 .Pp
+.Va mtype
+is less than 1.
+.Pp
 .Fa msgsz
 is greater than
 .Va msg_qbytes .



make msgsnd(2) more posix

2019-07-12 Thread Moritz Buhl
Hi,

while running some ported tests I noticed that msgsnd(2) did not error
when passing a message with mtype < 1, even tho the manual page states:
> mtype is an integer greater than 0 that can be used

POSIX says:
https://pubs.opengroup.org/onlinepubs/9699919799/
[EINVAL]
The value of msqid is not a valid message queue identifier, or the value of 
mtype is less than 1; or the value of msgsz is greater than the system-imposed 
limit.

Here is a minimal test:

#include 

struct msg {
long mtype;
char buf[3];
};

int
main(void)
{
struct msg msg = { 0, { 'a', 'b', 'c' } };
int id;

id = msgget(1234, IPC_CREAT | 0600);

// man 2 msgsnd: mtype is an integer greater than 0 ...
if (msgsnd(id, , sizeof(struct msg), IPC_NOWAIT) == -1)
return 0;
return 1;
}

and the patch follows.

greetings,
mbuhl


Index: sys/kern/sysv_msg.c
===
RCS file: /cvs/src/sys/kern/sysv_msg.c,v
retrieving revision 1.34
diff -u -p -r1.34 sysv_msg.c
--- sys/kern/sysv_msg.c 5 Dec 2018 15:42:45 -   1.34
+++ sys/kern/sysv_msg.c 12 Jul 2019 15:03:16 -
@@ -588,7 +588,7 @@ msg_copyin(struct msg *msg, const char *
return (error);
}
 
-   if (msg->msg_type < 0) {
+   if (msg->msg_type <= 0) {
msg_free(msg);
return (EINVAL);
}



Re: problems with libm

2019-07-11 Thread Moritz Buhl
Hi,

I made the FreeBSD msun regression tests compile on OpenBSD.
https://github.com/moritzbuhl/msun-regress

3 out of 19 test files pass. 14 files die after the first error case.
Two files (ctrig_test.c and trig_test.c) use atf and after some hacks
they report all error cases. 840 for ctrig_test and 88 for trig_test.

These test files should be reviewed carefully as I know for sure that many
don't work on i386 (adding some volatile keywords usually helps).

I believe all these errors paint a good picture. I will be looking into
fixing what I can.



listen(2) sets wrong errno

2019-07-09 Thread Moritz Buhl
Hi,

while running some NetBSD syscall tests I noticed that listen(2) returns
the wrong errno when a socket is already connected. man 2 listen already
mentions EINVAL.

NetBSD bug:
http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=46150
NetBSD patch:
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/kern/uipc_socket.c.diff?r1=1.209=1.210_with_tag=MAIN
Spec:
https://pubs.opengroup.org/onlinepubs/9699919799/

patch attached,
mbuhl

Index: sys/kern/uipc_socket.c
===
RCS file: /mount/openbsd/cvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.232
diff -u -p -r1.232 uipc_socket.c
--- sys/kern/uipc_socket.c  4 Jul 2019 17:42:17 -   1.232
+++ sys/kern/uipc_socket.c  9 Jul 2019 16:31:50 -
@@ -172,7 +172,7 @@ solisten(struct socket *so, int backlog)
int s, error;
 
if (so->so_state & (SS_ISCONNECTED|SS_ISCONNECTING|SS_ISDISCONNECTING))
-   return (EOPNOTSUPP);
+   return (EINVAL);
 #ifdef SOCKET_SPLICE
if (isspliced(so) || issplicedback(so))
return (EOPNOTSUPP);



getgroups(2) with negative values

2019-07-08 Thread Moritz Buhl
Hi,

while porting some NetBSD syscall tests to OpenBSD I noticed that the
getgroups test is failing. Simply put:

gid_t gidset[NGROUPS_MAX];
getgroups(-1, gidset);

This was fixed on NetBSD 8 years ago:
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/kern/kern_prot.c
>   if (SCARG(uap, gidsetsize) < (int)*retval)
>   return EINVAL;

While here, also remove the u_int in setgroups. POSIX does't say a lot
about setgroups and therefore return EINVAL.
https://pubs.opengroup.org/onlinepubs/9699919799/

Index: sys/kern/kern_prot.c
===
RCS file: /cvs/src/sys/kern/kern_prot.c,v
retrieving revision 1.75
diff -u -p -r1.75 kern_prot.c
--- sys/kern/kern_prot.c22 Jun 2018 13:33:30 -  1.75
+++ sys/kern/kern_prot.c8 Jul 2019 12:13:04 -
@@ -196,7 +196,7 @@ sys_getgroups(struct proc *p, void *v, r
syscallarg(gid_t *) gidset;
} */ *uap = v;
struct ucred *uc = p->p_ucred;
-   u_int ngrp;
+   int ngrp;
int error;
 
if ((ngrp = SCARG(uap, gidsetsize)) == 0) {
@@ -870,13 +870,13 @@ sys_setgroups(struct proc *p, void *v, r
struct process *pr = p->p_p;
struct ucred *pruc, *newcred;
gid_t groups[NGROUPS_MAX];
-   u_int ngrp;
+   int ngrp;
int error;
 
if ((error = suser(p)) != 0)
return (error);
ngrp = SCARG(uap, gidsetsize);
-   if (ngrp > NGROUPS_MAX)
+   if (ngrp > NGROUPS_MAX || ngrp < 0)
return (EINVAL);
error = copyin(SCARG(uap, gidset), groups, ngrp * sizeof(gid_t));
if (error == 0) {



fix format strings in relayd

2019-07-04 Thread Moritz Buhl
Hi,

as promised a patch to relayd to enable formatting warnings for yyerror.

Greetings,
Moritz Buhl

Index: usr.sbin/relayd/parse.y
===
RCS file: /cvs/src/usr.sbin/relayd/parse.y,v
retrieving revision 1.238
diff -u -p -r1.238 parse.y
--- usr.sbin/relayd/parse.y 31 May 2019 15:25:57 -  1.238
+++ usr.sbin/relayd/parse.y 4 Jul 2019 14:59:36 -
@@ -75,7 +75,9 @@ intpopfile(void);
 int check_file_secrecy(int, const char *);
 int yyparse(void);
 int yylex(void);
-int yyerror(const char *, ...);
+int yyerror(const char *, ...)
+__attribute__((__format__ (printf, 1, 2)))
+__attribute__((__nonnull__ (1)));
 int kw_cmp(const void *, const void *);
 int lookup(char *);
 int igetc(void);
@@ -350,7 +352,7 @@ port: PORT HTTP {
}
| PORT NUMBER {
if ($2 <= 0 || $2 > (int)USHRT_MAX) {
-   yyerror("invalid port: %d", $2);
+   yyerror("invalid port: %lld", $2);
YYERROR;
}
$$.val[0] = htons($2);
@@ -389,7 +391,7 @@ sendbuf : NOTHING   {
 
 main   : INTERVAL NUMBER   {
if ((conf->sc_conf.interval.tv_sec = $2) < 0) {
-   yyerror("invalid interval: %d", $2);
+   yyerror("invalid interval: %lld", $2);
YYERROR;
}
}
@@ -403,7 +405,7 @@ main: INTERVAL NUMBER   {
| PREFORK NUMBER{
if ($2 <= 0 || $2 > PROC_MAX_INSTANCES) {
yyerror("invalid number of preforked "
-   "relays: %d", $2);
+   "relays: %lld", $2);
YYERROR;
}
conf->sc_conf.prefork_relay = $2;
@@ -895,7 +897,7 @@ tablecheck  : ICMP  { table->conf.check 
}
table->conf.check = CHECK_HTTP_CODE;
if ((table->conf.retcode = $5) <= 0) {
-   yyerror("invalid HTTP code: %d", $5);
+   yyerror("invalid HTTP code: %lld", $5);
free($2);
free($3);
YYERROR;
@@ -1094,7 +1096,7 @@ httpflags_l   : httpflags comma httpflags_
 
 httpflags  : HEADERLEN NUMBER  {
if ($2 < 0 || $2 > RELAY_MAXHEADERLENGTH) {
-   yyerror("invalid headerlen: %d", $2);
+   yyerror("invalid headerlen: %lld", $2);
YYERROR;
}
proto->httpheaderlen = $2;
@@ -1115,7 +1117,7 @@ tcpflags  : SACK  { proto->tcpflags |= T
| NO SPLICE { proto->tcpflags |= TCPFLAG_NSPLICE; }
| BACKLOG NUMBER{
if ($2 < 0 || $2 > RELAY_MAX_BACKLOG) {
-   yyerror("invalid backlog: %d", $2);
+   yyerror("invalid backlog: %lld", $2);
YYERROR;
}
proto->tcpbacklog = $2;
@@ -1123,13 +1125,13 @@ tcpflags: SACK  { 
proto->tcpflags |= T
| SOCKET BUFFER NUMBER  {
proto->tcpflags |= TCPFLAG_BUFSIZ;
if ((proto->tcpbufsiz = $3) < 0) {
-   yyerror("invalid socket buffer size: %d", $3);
+   yyerror("invalid socket buffer size: %lld", $3);
YYERROR;
}
}
| IP STRING NUMBER  {
if ($3 < 0) {
-   yyerror("invalid ttl: %d", $3);
+   yyerror("invalid ttl: %lld", $3);
free($2);
YYERROR;
}
@@ -2072,7 +2074,7 @@ routeoptsl: ROUTE addrprefix {
YYERROR;
}
if ($2 < 0 || $2 > RT_TABLEID_MAX) {
-   yyerror("invalid rtable id %d", $2);
+ 

fix pfctl regress on armv7

2019-07-04 Thread Moritz Buhl
Hi,

due to the pfctl regression tests failing on armv7 I noticed that the
yyerror function is having the usual format attributes. This lead to
garbage in error messages on armv7. I added the directives and then
adjusted the formatting to what the compiler suggested.

The same goes for relayd. A patch will follow.

Thanks for feedback and comments,
Moritz Buhl


Index: sbin/pfctl/parse.y
===
RCS file: /cvs/src/sbin/pfctl/parse.y,v
retrieving revision 1.696
diff -u -p -r1.696 parse.y
--- sbin/pfctl/parse.y  8 May 2019 21:31:30 -   1.696
+++ sbin/pfctl/parse.y  4 Jul 2019 14:34:29 -
@@ -83,7 +83,9 @@ intpopfile(void);
 int check_file_secrecy(int, const char *);
 int yyparse(void);
 int yylex(void);
-int yyerror(const char *, ...);
+int yyerror(const char *, ...)
+__attribute__((__format__ (printf, 1, 2)))
+__attribute__((__nonnull__ (1)));
 int kw_cmp(const void *, const void *);
 int lookup(char *);
 int igetc(void);
@@ -1041,7 +1043,7 @@ scrub_opt : NODF  {
YYERROR;
}
if ($2 < 0 || $2 > 255) {
-   yyerror("illegal min-ttl value %d", $2);
+   yyerror("illegal min-ttl value %lld", $2);
YYERROR;
}
scrub_opts.marker |= FOM_MINTTL;
@@ -1053,7 +1055,7 @@ scrub_opt : NODF  {
YYERROR;
}
if ($2 < 0 || $2 > 65535) {
-   yyerror("illegal max-mss value %d", $2);
+   yyerror("illegal max-mss value %lld", $2);
YYERROR;
}
scrub_opts.marker |= FOM_MAXMSS;
@@ -2218,7 +2220,7 @@ filter_set: prio {
YYERROR;
}
if ($2 < 0 || $2 > 0x) {
-   yyerror("illegal delay value %d (0-%u)", $2,
+   yyerror("illegal delay value %lld (0-%u)", $2,
0x);
YYERROR;
}
@@ -2289,7 +2291,7 @@ blockspec : /* empty */   {
}
| RETURNRST '(' TTL NUMBER ')'  {
if ($4 < 0 || $4 > 255) {
-   yyerror("illegal ttl value %d", $4);
+   yyerror("illegal ttl value %lld", $4);
YYERROR;
}
$$.b2 = PFRULE_RETURNRST;
@@ -2339,7 +2341,7 @@ reticmpspec   : STRING{
u_int8_ticmptype;
 
if ($1 < 0 || $1 > 255) {
-   yyerror("invalid icmp code %lu", $1);
+   yyerror("invalid icmp code %lld", $1);
YYERROR;
}
icmptype = returnicmpdefault >> 8;
@@ -2358,7 +2360,7 @@ reticmp6spec  : STRING{
u_int8_ticmptype;
 
if ($1 < 0 || $1 > 255) {
-   yyerror("invalid icmp code %lu", $1);
+   yyerror("invalid icmp code %lld", $1);
YYERROR;
}
icmptype = returnicmp6default >> 8;
@@ -2788,7 +2790,7 @@ host  : STRING{
if (strlcpy($$->addr.v.rtlabelname, $2,
sizeof($$->addr.v.rtlabelname)) >=
sizeof($$->addr.v.rtlabelname)) {
-   yyerror("route label too long, max %u chars",
+   yyerror("route label too long, max %lu chars",
sizeof($$->addr.v.rtlabelname) - 1);
free($2);
free($$);
@@ -3014,7 +3016,7 @@ uid   : STRING{
}
| NUMBER{
if ($1 < 0 || $1 >= UID_MAX) {
-   yyerror("illegal uid value %lu", $1);
+   yyerror("illegal uid value %lld", $1);
YYERROR;
   

port NetBSD libc tests

2019-07-01 Thread Moritz Buhl
Hi,

I started porting some NetBSD libc tests. They use atf which is a lot more
than necessary for most tests.
Attached is a passing test file for access(2). I would continue the same
way with the other tests if people say it is ok to work around atf this
way. I only changed two includes in t_access.c and left the file as it is.
This is not possible for all other tests as NetBSD has e.g. fchroot and
a differently behaving dup3.

I appreciate comments and feedback.

Thanks,
mbuhl


Index: regress/lib/libc/sys/Makefile
===
RCS file: regress/lib/libc/sys/Makefile
diff -N regress/lib/libc/sys/Makefile
--- /dev/null   1 Jan 1970 00:00:00 -
+++ regress/lib/libc/sys/Makefile   1 Jul 2019 06:53:18 -
@@ -0,0 +1,10 @@
+PROGS +=   t_access
+
+.for t in ${PROGS}
+REGRESS_TARGETS+= run-$t
+run-$t: $t
+   @echo "\n $@ "
+   ./$t
+.endfor
+
+.include 
Index: regress/lib/libc/sys/atf-c.h
===
RCS file: regress/lib/libc/sys/atf-c.h
diff -N regress/lib/libc/sys/atf-c.h
--- /dev/null   1 Jan 1970 00:00:00 -
+++ regress/lib/libc/sys/atf-c.h27 Jun 2019 19:07:52 -
@@ -0,0 +1,66 @@
+/* $OpenBSD$   */
+/*
+ * Copyright (c) 2019 Moritz Buhl 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#if !defined(ATF_C_H)
+#define ATF_C_H
+
+#include 
+
+#include 
+
+/* error.h */
+
+/* macros.h */
+#define ATF_TC(fn) \
+void atf_##fn(void);   \
+void atf_head_##fn(void);  \
+void atf_body_##fn(void);  \
+void   \
+atf_##fn(void) \
+{  \
+   atf_head_##fn();\
+   atf_body_##fn();\
+}
+
+#define ATF_TC_WITH_CLEANUP(fn)
\
+void atf_##fn(void);   \
+void atf_head_##fn(void);  \
+void atf_body_##fn(void);  \
+void atf_cleanup_##fn(void);   \
+void   \
+atf_##fn(void) \
+{  \
+   atf_head_##fn();\
+   atf_body_##fn();\
+   atf_cleanup_##fn(); \
+}
+
+#define ATF_TC_HEAD(fn, tc)void atf_head_##fn(void)
+#define ATF_TC_BODY(fn, tc)void atf_body_##fn(void)
+#define ATF_TC_CLEANUP(fn, tc) void atf_cleanup_##fn(void)
+
+#define ATF_TP_ADD_TCS(tp) int main(void)
+#define ATF_TP_ADD_TC(tp, fn)  atf_##fn()
+
+#define atf_tc_set_md_var(tc, attr, fmt, ...)  \
+   printf(attr ": " fmt "\n", ##__VA_ARGS__)
+
+#define ATF_REQUIRE(exp)   if (!(exp)) err(1, __func__)
+
+#define atf_no_error() 0
+
+#endif /* !defined(ATF_C_H) */
Index: regress/lib/libc/sys/macros.h
===
RCS file: regress/lib/libc/sys/macros.h
diff -N regress/lib/libc/sys/macros.h
--- /dev/null   1 Jan 1970 00:00:00 -
+++ regress/lib/libc/sys/macros.h   27 Jun 2019 19:07:26 -
@@ -0,0 +1,7 @@
+/* $OpenBSD$   */
+/* Public domain - Moritz Buhl */
+
+#include 
+
+#define __RCSID(str)
+#define __arraycount(_a)   nitems(_a)
Index: regress/lib/libc/sys/t_access.c
===
RCS file: regress/lib/libc/sys/t_access.c
diff -N regress/lib/libc/sys/t_access.c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ regress/lib/libc/sys/t_access.c 27 Jun 2019 19:05:27 

problems with libm

2019-07-01 Thread Moritz Buhl
Hi,

while testing arm hardware on OpenBSD I noticed that some floating point
operations cause failures of other tests.
In fact the current libm is incorrect according to the ISO C Std Annex
G. I found this out after porting some FreeBSD lib msun tests. Many edge
cases for complex floating point operations are not covered at all.

My question on this is what I should do about this. Port the FreeBSD
msun library? Ignore the problem? Patch the current implementation?

I attached a test that is currently breaking. There are many more. To
fix these I just copied the FreeBSD file that implements the failing
function. But I am not sure if this is a good approach.

mbuhl

Index: regress/lib/libm/msun/Makefile
===
RCS file: /cvs/src/regress/lib/libm/msun/Makefile,v
retrieving revision 1.1.1.1
diff -u -p -r1.1.1.1 Makefile
--- regress/lib/libm/msun/Makefile  21 Feb 2019 16:14:03 -  1.1.1.1
+++ regress/lib/libm/msun/Makefile  31 May 2019 19:50:32 -
@@ -1,6 +1,7 @@
 # $OpenBSD: Makefile,v 1.1.1.1 2019/02/21 16:14:03 bluhm Exp $
 
 TESTS =
+TESTS +=   cexp_test
 TESTS +=   conj_test
 TESTS +=   fenv_test
 TESTS +=   lrint_test
Index: regress/lib/libm/msun/cexp_test.c
===
RCS file: regress/lib/libm/msun/cexp_test.c
diff -N regress/lib/libm/msun/cexp_test.c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ regress/lib/libm/msun/cexp_test.c   1 Jun 2019 05:40:51 -
@@ -0,0 +1,326 @@
+/*-
+ * Copyright (c) 2008-2011 David Schultz 
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *notice, this list of conditions and the following disclaimer in the
+ *documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+/*
+ * Tests for corner cases in cexp*().
+ */
+
+#include 
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test-utils.h"
+
+#pragma STDC FENV_ACCESS   ON
+#pragmaSTDC CX_LIMITED_RANGE   OFF
+
+/*
+ * Test that a function returns the correct value and sets the
+ * exception flags correctly. The exceptmask specifies which
+ * exceptions we should check. We need to be lenient for several
+ * reasons, but mainly because on some architectures it's impossible
+ * to raise FE_OVERFLOW without raising FE_INEXACT. In some cases,
+ * whether cexp() raises an invalid exception is unspecified.
+ *
+ * These are macros instead of functions so that assert provides more
+ * meaningful error messages.
+ *
+ * XXX The volatile here is to avoid gcc's bogus constant folding and work
+ * around the lack of support for the FENV_ACCESS pragma.
+ */
+#definetest(func, z, result, exceptmask, excepts, checksign)   do {
\
+   volatile long double complex _d = z;\
+   assert(feclearexcept(FE_ALL_EXCEPT) == 0);  \
+   assert(cfpequal_cs((func)(_d), (result), (checksign))); \
+   assert(((void)(func), fetestexcept(exceptmask) == (excepts)));  \
+} while (0)
+
+#define test_c(t, func, z, result, exceptmask, excepts, checksign) do { \
+volatile t complex r = result;  \
+test(func, z, r, exceptmask, excepts, checksign);   \
+} while (0)
+
+#define test_f(func, z, result, exceptmask, excepts, checksign) do {\
+test_c(float, func, z, result, exceptmask, excepts, checksign); \
+} while (0)
+
+/* Test within a given tolerance. */
+#definetest_tol(func, z, result, tol)  do {
\
+   volatile long double complex _d = z;\
+   assert(cfpequal_tol((func)(_d), (result), (tol),\
+   FPE_ABS_ZERO | CS_BOTH));   \
+} while 

Fix regress/sys/rtable for octeon

2019-06-23 Thread Moritz Buhl
Hi,
while running regression tests on an octeon ERL, I noticed that
no-macro-redefined flag was not known to the compiler.
I added some #undefs and some hours ago a change to art_walk made
the tests fail too.

The no-macro-redefine was added 3 month ago.

Thanks,
mbuhl

Index: regress/sys/net/rtable/Makefile.inc
===
RCS file: /mount/openbsd/cvs/src/regress/sys/net/rtable/Makefile.inc,v
retrieving revision 1.4
diff -u -p -r1.4 Makefile.inc
--- regress/sys/net/rtable/Makefile.inc 31 Mar 2019 14:03:40 -  1.4
+++ regress/sys/net/rtable/Makefile.inc 21 Jun 2019 18:26:06 -
@@ -9,6 +9,6 @@ SRCS+=  art.c
 CFLAGS+=   -DART
 .endif
 
-CPPFLAGS+= -I${TOPDIR} -Wall -Wno-macro-redefined
+CPPFLAGS+= -I${TOPDIR} -Wall
 
 .PATH: ${TOPDIR} ${TOPDIR}/../../../../sys/net
Index: regress/sys/net/rtable/srp_compat.h
===
RCS file: /mount/openbsd/cvs/src/regress/sys/net/rtable/srp_compat.h,v
retrieving revision 1.5
diff -u -p -r1.5 srp_compat.h
--- regress/sys/net/rtable/srp_compat.h 27 Jul 2017 13:34:30 -  1.5
+++ regress/sys/net/rtable/srp_compat.h 21 Jun 2019 14:38:33 -
@@ -48,7 +48,9 @@ srp_swap_locked(struct srp *srp, void *n
  */
 
 #define SRPL_INIT(_sl) SLIST_INIT(_sl)
+#undef SRPL_HEAD
 #define SRPL_HEAD(name, entry) SLIST_HEAD(name, entry)
+#undef SRPL_ENTRY
 #define SRPL_ENTRY(type)   SLIST_ENTRY(type)
 
 #define SRPL_FIRST(_sr, _sl)   SLIST_FIRST(_sl);
Index: regress/sys/net/rtable/delete/main.c
===
RCS file: /mount/openbsd/cvs/src/regress/sys/net/rtable/delete/main.c,v
retrieving revision 1.5
diff -u -p -r1.5 main.c
--- regress/sys/net/rtable/delete/main.c15 Nov 2016 10:43:41 -  
1.5
+++ regress/sys/net/rtable/delete/main.c21 Jun 2019 18:20:20 -
@@ -58,9 +58,9 @@ main(int argc, char *argv[])
 
do_from_file(0, AF_INET6, filename, route_insert);
 
-   rtable_walk(0, AF_INET6, rtentry_delete, NULL);
+   rtable_walk(0, AF_INET6, NULL, rtentry_delete, NULL);
 
-   rtable_walk(0, AF_INET6, rtentry_dump, NULL);
+   rtable_walk(0, AF_INET6, NULL, rtentry_dump, NULL);
 
 #ifdef ART
struct art_root *ar;
Index: regress/sys/net/rtable/fullfeed/main.c
===
RCS file: /mount/openbsd/cvs/src/regress/sys/net/rtable/fullfeed/main.c,v
retrieving revision 1.3
diff -u -p -r1.3 main.c
--- regress/sys/net/rtable/fullfeed/main.c  15 Nov 2016 10:43:41 -  
1.3
+++ regress/sys/net/rtable/fullfeed/main.c  21 Jun 2019 18:20:38 -
@@ -53,7 +53,7 @@ main(int argc, char *argv[])
do_from_file(0, af, filename, route_insert);
do_from_file(0, af, filename, route_lookup);
 
-   rtable_walk(0, af, rtentry_dump, NULL);
+   rtable_walk(0, af, NULL, rtentry_dump, NULL);
 
do_from_file(0, af, filename, route_delete);
 



iked regression test with obj dir

2019-06-22 Thread Moritz Buhl
Hi,
I noticed that the iked regress test fails if I have an obj directory.
The following patch adresses this.

mbuhl

Index: regress/sbin/iked/parser/Makefile
===
RCS file: /mount/openbsd/cvs//src/regress/sbin/iked/parser/Makefile,v
retrieving revision 1.2
diff -u -p -r1.2 Makefile
--- regress/sbin/iked/parser/Makefile   30 May 2017 15:36:13 -  1.2
+++ regress/sbin/iked/parser/Makefile   21 Jun 2019 02:06:02 -
@@ -17,7 +17,7 @@ test_parser: ${IKEOBJS}

 ${IKEOBJS}:
cd ${.CURDIR}/../../../../sbin/iked && make $@
-   ln -sf ${.OBJDIR}/../../../../sbin/iked/$@ .
+   ln -sf ${.CURDIR}/../../../../sbin/iked/$@ .

 LDADD+=-L${.OBJDIR} -ltest_helper
 DPADD+=libtest_helper.a



user mod -u lead to segfault

2018-10-13 Thread Moritz Buhl
Hi,

when using user mod -u, my machine ran into a segfault and the passwd
file was left locked.

I belief this is due to the changes to getpwnam etc.
Sadly I am unable to figure out, why exactly the call to getpwuid
segfaults. But replacing it with a call to user_from_uid helped.
I could not crash the program with other flags that use getpw{nam,uid}.

Thanks,
Moritz Buhl


Index: user.c
===
RCS file: /cvs/src/usr.sbin/user/user.c,v
retrieving revision 1.122
diff -u -p -r1.122 user.c
--- user.c  26 Sep 2018 14:54:58 -  1.122
+++ user.c  8 Oct 2018 17:00:16 -
@@ -1515,7 +1515,8 @@ moduser(char *login_name, char *newlogin
}
if (up->u_flags & F_UID) {
/* check uid isn't already allocated */
-   if (!(up->u_flags & F_DUPUID) && 
getpwuid((uid_t)(up->u_uid)) != NULL) {
+   if (!(up->u_flags & F_DUPUID) &&
+   user_from_uid((uid_t)(up->u_uid), 1) != NULL) {
close(ptmpfd);
pw_abort();
errx(EXIT_FAILURE, "uid %u is already in use", 
up->u_uid);



top cpu stats are wrong with hyper threading

2018-10-05 Thread Moritz Buhl
Hi,

Due to the SMT stuff the output of top showed the first few cpus instead
of the ones that are actually active.

To reproduce the bad output:
Use a machine with hyper therading, top should show half the cpus, of
which every second is disabled.

The following diff skips the disabled cpus and disabling/reenabling the
cores with hw.smt also works.
The only problem is that the lines reserved for cpu-stats does not
change with reenabling. Refreshing the output with '1' or resizing the
window should help.

Thanks,
Moritz Buhl


Index: display.c
===
RCS file: /home/mbuhl/cvs/src/usr.bin/top/display.c,v
retrieving revision 1.55
diff -u -p -r1.55 display.c
--- display.c   26 Sep 2018 17:23:13 -  1.55
+++ display.c   5 Oct 2018 15:19:00 -
@@ -381,7 +381,7 @@ cpustates_tag(int cpu)
 void
 i_cpustates(int64_t *ostates, int *online)
 {
-   int i, first, cpu, ncpuonline;
+   int i, first, cpu, ncpuonline, off;
double value;
int64_t *states;
char **names, *thisname;
@@ -434,15 +434,18 @@ i_cpustates(int64_t *ostates, int *onlin
}
return;
}
+   off = 0;
for (cpu = 0; cpu < ncpu; cpu++) {
/* now walk thru the names and print the line */
names = cpustate_names;
first = 0;
states = ostates + (CPUSTATES * cpu);
 
-   if ((screen_length > 2 + cpu && 2 + cpu < y_mem) ||
+   if (y_mem == ncpuonline + 2 && !online[cpu])
+   continue;
+   if ((screen_length > 2 + cpu && 2 + off < y_mem) ||
!smart_terminal) {
-   move(2 + cpu, 0);
+   move(2 + off, 0);
clrtoeol();
addstrp(cpustates_tag(cpu));
 
@@ -465,6 +468,7 @@ i_cpustates(int64_t *ostates, int *onlin
}
putn();
}
+   off++;
}
 }
 



wdc polling lead to unclean fs and panics

2018-10-04 Thread Moritz Buhl
Hi,

when rebooting a PowerBook G4, the kernel sometimes paniced while
detaching the disk.
Additionallly, during boot it was noted that / is not clean.
The following patch special cases polling to not return early anymore
and adds polling to all previous elements in the queue.
Please tell me if this is not the right way to address the problem.

Thanks,
mbuhl

Index: dev/ic/wdc.c
===
RCS file: /cvs/src/sys/dev/ic/wdc.c,v
retrieving revision 1.134
diff -u -p -r1.134 wdc.c
--- dev/ic/wdc.c30 Dec 2017 23:08:29 -  1.134
+++ dev/ic/wdc.c2 Oct 2018 14:01:04 -
@@ -888,7 +888,8 @@ wdcstart(struct channel_softc *chp)
/* adjust chp, in case we have a shared queue */
chp = xfer->chp;
 
-   if ((chp->ch_flags & WDCF_ACTIVE) != 0 ) {
+   if ((chp->ch_flags & WDCF_ACTIVE) != 0 &&
+   (xfer->c_flags & C_POLL) == 0) {
return; /* channel already active */
}
 #ifdef DIAGNOSTIC
@@ -1905,6 +1906,7 @@ wdccommandshort(struct channel_softc *ch
 void
 wdc_exec_xfer(struct channel_softc *chp, struct wdc_xfer *xfer)
 {
+   struct wdc_xfer *iter;
WDCDEBUG_PRINT(("wdc_exec_xfer %p flags 0x%x channel %d drive %d\n",
xfer, xfer->c_flags, chp->channel, xfer->drive), DEBUG_XFERS);
 
@@ -1918,7 +1920,8 @@ wdc_exec_xfer(struct channel_softc *chp,
 */
if ((xfer->c_flags & C_POLL) != 0 &&
!TAILQ_EMPTY(>ch_queue->sc_xfer)) {
-   TAILQ_INIT(>ch_queue->sc_xfer);
+   TAILQ_FOREACH(iter, >ch_queue->sc_xfer, c_xferchain) 
+   iter->c_flags |= C_POLL;
}
/* insert at the end of command list */
TAILQ_INSERT_TAIL(>ch_queue->sc_xfer,xfer , c_xferchain);



Unplugging USB network stick during tcpdump crashes system

2018-07-04 Thread Moritz Buhl
Hello,

Removing my usb network interface during `tcpdump -i rum0` caused my
kernel to crash. A fix is attached.

Thanks,
Moritz Buhl

Index: net/bpf.c
===
RCS file: /cvs/src/sys/net/bpf.c,v
retrieving revision 1.169
diff -u -p -r1.169 bpf.c
--- net/bpf.c   2 Mar 2018 16:57:41 -   1.169
+++ net/bpf.c   4 Jul 2018 19:42:35 -
@@ -332,7 +332,8 @@ bpf_detachd(struct bpf_d *d)
mtx_enter(>bd_mtx);
bpf_put(d);
 
-   if (error && !(error == EINVAL || error == ENODEV))
+   if (error && !(error == EINVAL || error == ENODEV ||
+   error == ENXIO))
/*
 * Something is really wrong if we were able to put
 * the driver into promiscuous mode, but can't



Re: Remove vgrind in ctags

2016-11-08 Thread Moritz Buhl

Sorry, I noticed that vgrind still is in the ports after writing the mail.
I will do more research before writing a patch next time.


On 11/08/16 11:49, Theo Buehler wrote:

On Tue, Nov 08, 2016 at 11:42:17AM +0100, Theo Buehler wrote:

On Tue, Nov 08, 2016 at 10:53:37AM +0100, Moritz Buhl wrote:

Hi, since vgrind was removed in 4.9 I removed the -v option in ctags.
At least the example in ctags(1) could be modified.

Greetings Moritz Buhl

The diff got mangled by your mail program, please resend it.  The
synopsis in ctags.1 would also need to be adjusted by removing v.


? ctags-vgrind.diff

and is there some part of your diff missing? vflag also appears in
ctags.h and print.c





Remove vgrind in ctags

2016-11-08 Thread Moritz Buhl

Hi, since vgrind was removed in 4.9 I removed the -v option in ctags.
At least the example in ctags(1) could be modified.

Greetings Moritz Buhl

? ctags-vgrind.diff

Index: ctags.1

===

RCS file: /cvs/src/usr.bin/ctags/ctags.1,v

retrieving revision 1.33

diff -u -p -r1.33 ctags.1

--- ctags.1 31 Dec 2015 14:01:26 - 1.33

+++ ctags.1 8 Nov 2016 09:37:27 -

@@ -91,19 +91,6 @@ Update the specified files in the

file; that is, all

references to them are regenerated, keeping only the other values in the

file.

-.It Fl v

-An index of the form expected by vgrind

-is produced on the standard output.

-This listing contains the object name, file name, and page number 
(assuming


-64 line pages).

-Since the output will be sorted into lexicographic order,

-it may be desired to run the output through

-.Xr sort 1 .

-Sample use:

-.Bd -literal -offset indent

-$ ctags -v files | sort -f > index

-$ vgrind -x index

-.Ed

.It Fl w

Suppress warning diagnostics.

.It Fl x

Index: ctags.c

===

RCS file: /cvs/src/usr.bin/ctags/ctags.c,v

retrieving revision 1.18

diff -u -p -r1.18 ctags.c

--- ctags.c 9 Oct 2015 01:37:07 - 1.18

+++ ctags.c 8 Nov 2016 09:37:27 -

@@ -55,7 +55,6 @@ long lineftell; /* ftell after getc( in

int lineno; /* line number of current line */

int dflag; /* -d: non-macro defines */

-int vflag; /* -v: vgrind style index output */

int wflag; /* -w: suppress warnings */

int xflag; /* -x: cxref style output */

@@ -81,7 +80,7 @@ main(int argc, char *argv[])

err(1, "pledge");

aflag = uflag = NO;

- while ((ch = getopt(argc, argv, "BFadf:tuwvx")) != -1)

+ while ((ch = getopt(argc, argv, "BFadf:tuwx")) != -1)

switch(ch) {

case 'B':

searchar = '?';

@@ -107,8 +106,6 @@ main(int argc, char *argv[])

case 'w':

wflag = 1;

break;

- case 'v':

- vflag = 1;

case 'x':

xflag = 1;

break;

@@ -120,12 +117,12 @@ main(int argc, char *argv[])

argc -= optind;

if (!argc) {

usage: (void)fprintf(stderr,

- "usage: ctags [-aBdFuvwx] [-f tagsfile] file ...\n");

+ "usage: ctags [-aBdFuwx] [-f tagsfile] file ...\n");

exit(1);

}

init();

- if (uflag && !vflag && !xflag)

+ if (uflag && !xflag)

preload_entries(outfile, argc, argv);

for (exit_val = step = 0; step < argc; ++step)