Re: pf af-to route output

2016-11-22 Thread Alexandr Nedvedicky
On Wed, Nov 23, 2016 at 01:43:59AM +0100, Mike Belopuhov wrote:
> On 23 November 2016 at 01:42, Alexander Bluhm  wrote:
> > On Tue, Nov 22, 2016 at 01:44:09PM +0100, Mike Belopuhov wrote:
> >> OK, all I wanted to know was if this is know to work and if it has
> >> been tested.  I'd argue that we don't put the code that doesn't work
> >> or not tested or we don't know what it does :)
> >
> > After looking at all the cases, it will be hard to test the at-to
> > with route-to combinations.  As the feature never worked, let's
> > disable it.  If someone has a usecase, he can put it back.
> >
> > ok?
> >
> > bluhm
> >
> 
> I agree.  If someone makes it work and gets it tested,
> we can consider including the code.

no objection here too. I like bluhm's patch.



Re: smtpd: more internal cleanups

2016-11-22 Thread Gilles Chehade
On Wed, Nov 23, 2016 at 09:14:13AM +0530, Sunil Nimmagadda wrote:
> 
> Eric Faurot writes:
> 
> > Hi.
> >
> > When using the internal io api, the caller had to explicitely reset an
> > io event in some cases (basically when data is buffered outside of an
> > io callback context) which could easily be missed.  This has been the
> > cause of some of smtpd bugs in the past (hanging sessions).
> >
> > After the recent changes, it's now possible to make it a lot simpler
> > by triggering an event reload internally when data is queued. So the
> > api user does not have to worry about it.
> >
> > Eric.
> 
> Ok sunil@
> 

been running with my server patched since yesterday, no regression

ok gilles@


> >
> > Index: ioev.c
> > ===
> > RCS file: /cvs/src/usr.sbin/smtpd/ioev.c,v
> > retrieving revision 1.31
> > diff -u -p -r1.31 ioev.c
> > --- ioev.c  22 Nov 2016 07:28:42 -  1.31
> > +++ ioev.c  22 Nov 2016 22:24:25 -
> > @@ -370,13 +370,25 @@ io_set_write(struct io *io)
> >  int
> >  io_write(struct io *io, const void *buf, size_t len)
> >  {
> > -   return iobuf_queue(io->iobuf, buf, len);
> > +   int r;
> > +
> > +   r = iobuf_queue(io->iobuf, buf, len);
> > +
> > +   io_reload(io);
> > +
> > +   return r;
> >  }
> >  
> >  int
> >  io_writev(struct io *io, const struct iovec *iov, int iovcount)
> >  {
> > -   return iobuf_queuev(io->iobuf, iov, iovcount);
> > +   int r;
> > +
> > +   r = iobuf_queuev(io->iobuf, iov, iovcount);
> > +
> > +   io_reload(io);
> > +
> > +   return r;
> >  }
> >  
> >  int
> > Index: mta_session.c
> > ===
> > RCS file: /cvs/src/usr.sbin/smtpd/mta_session.c,v
> > retrieving revision 1.89
> > diff -u -p -r1.89 mta_session.c
> > --- mta_session.c   22 Nov 2016 07:28:42 -  1.89
> > +++ mta_session.c   22 Nov 2016 22:24:26 -
> > @@ -280,7 +280,6 @@ mta_session_imsg(struct mproc *p, struct
> > mta_flush_task(s, IMSG_MTA_DELIVERY_TEMPFAIL,
> > "Could not get message fd", 0, 0);
> > mta_enter_state(s, MTA_READY);
> > -   io_reload(&s->io);
> > return;
> > }
> >  
> > @@ -289,7 +288,6 @@ mta_session_imsg(struct mproc *p, struct
> > fatal("mta: fdopen");
> >  
> > mta_enter_state(s, MTA_MAIL);
> > -   io_reload(&s->io);
> > return;
> >  
> > case IMSG_MTA_DNS_PTR:
> > @@ -366,7 +364,6 @@ mta_session_imsg(struct mproc *p, struct
> >  
> > mta_tls_verified(s);
> > io_resume(&s->io, IO_PAUSE_IN);
> > -   io_reload(&s->io);
> > return;
> >  
> > case IMSG_MTA_LOOKUP_HELO:
> > @@ -456,7 +453,6 @@ mta_on_timeout(struct runq *runq, void *
> > s->hangon++;
> >  
> > mta_enter_state(s, MTA_READY);
> > -   io_reload(&s->io);
> >  }
> >  
> >  static void
> > Index: smtp_session.c
> > ===
> > RCS file: /cvs/src/usr.sbin/smtpd/smtp_session.c,v
> > retrieving revision 1.295
> > diff -u -p -r1.295 smtp_session.c
> > --- smtp_session.c  22 Nov 2016 07:28:42 -  1.295
> > +++ smtp_session.c  22 Nov 2016 22:24:26 -
> > @@ -738,14 +738,12 @@ smtp_session_imsg(struct mproc *p, struc
> > smtp_filter_tx_rollback(s);
> > smtp_tx_free(s->tx);
> > smtp_reply(s, "%d %s", 530, "Sender rejected");
> > -   io_reload(&s->io);
> > break;
> > case LKA_TEMPFAIL:
> > smtp_filter_tx_rollback(s);
> > smtp_tx_free(s->tx);
> > smtp_reply(s, "421 %s: Temporary Error",
> > esc_code(ESC_STATUS_TEMPFAIL, 
> > ESC_OTHER_MAIL_SYSTEM_STATUS));
> > -   io_reload(&s->io);
> > break;
> > }
> > return;
> > @@ -766,7 +764,6 @@ smtp_session_imsg(struct mproc *p, struc
> > case LKA_TEMPFAIL:
> > smtp_reply(s, "%s", line);
> > }
> > -   io_reload(&s->io);
> > return;
> >  
> > case IMSG_SMTP_LOOKUP_HELO:
> > @@ -802,7 +799,6 @@ smtp_session_imsg(struct mproc *p, struc
> > smtp_enter_state(s, STATE_QUIT);
> > }
> > m_end(&m);
> > -   io_reload(&s->io);
> > return;
> >  
> > case IMSG_SMTP_MESSAGE_OPEN:
> > @@ -818,7 +814,6 @@ smtp_session_imsg(struct mproc *p, struc
> > smtp_reply(s, "421 %s: Temporary Error",
> > esc_code(ESC_STATUS_TEMPFAIL, 
> > ESC_OTHER_MAIL_SYSTEM_STATUS));
> > smtp_enter_state(s, STATE_QUIT);
> > -   io_reload(&s->io);
> > return;
> > }
> >  
> > @@ -872,7 +867,6 @@ smtp_session_imsg(struct

Re: Unnecessary goto in ip_output()

2016-11-22 Thread Claudio Jeker
On Tue, Nov 22, 2016 at 04:55:17PM +0100, Martin Pieuchot wrote:
> After the last IPSEC-related refactoring this goto no longer make sense.
> 
> ok?

Are you shure? I'm not convinced that for an INADDR_BROADCAST destination
the code would do the same. I think it is fine but I can't prove it.
 
> Index: netinet/ip_output.c
> ===
> RCS file: /cvs/src/sys/netinet/ip_output.c,v
> retrieving revision 1.330
> diff -u -p -r1.330 ip_output.c
> --- netinet/ip_output.c   18 Nov 2016 02:53:47 -  1.330
> +++ netinet/ip_output.c   22 Nov 2016 15:51:30 -
> @@ -249,11 +249,6 @@ reroute:
>* computation now.
>*/
>   in_proto_cksum_out(m, NULL);
> -
> - /* If it's not a multicast packet, try to fast-path */
> - if (!IN_MULTICAST(ip->ip_dst.s_addr)) {
> - goto sendit;
> - }
>   }
>   }
>  #endif /* IPSEC */
> 

-- 
:wq Claudio



armv7/stand/efiboot: working BOOTAA64.EFI payload

2016-11-22 Thread Ian Sutton
Hi,

Patch adds armv8/aarch64 EFI payload image build support. With
patrick@'s aarch64-none-elf- toolchain referenced in my last mail on
this list, you can build it via...

$ MACHINE=armv8 armmake64-aarch64 -f Makefile.arm64

...where armmake64-aarch64 is a simple env wrapper for the toolchain,
uploaded here:

https://ce.gl/armmake64-aarch64.txt

I've tested this on the pine64, it works up until boot(0) is called
which fails as there is no bootable disk/aarch64 kernel yet.

I've put this in the armv7 directory pending a proper
armv8/aarch64/arm64 one.

Ian

Index: sys/arch/arm/include/reloc.h
===
RCS file: /cvs/src/sys/arch/arm/include/reloc.h,v
retrieving revision 1.2
diff -u -p -r1.2 reloc.h
--- sys/arch/arm/include/reloc.h26 May 2006 20:22:04 -  1.2
+++ sys/arch/arm/include/reloc.h23 Nov 2016 03:54:28 -
@@ -52,3 +52,19 @@
 #define R_ARM_RPC24254
 #define R_ARM_RBASE255
 
+#defineR_AARCH64_NONE  0   /* No relocation */
+#defineR_AARCH64_ABS64 257 /* Absolute offset */
+#defineR_AARCH64_ABS32 258 /* Absolute, 32-bit overflow 
check */
+#defineR_AARCH64_ABS16 259 /* Absolute, 16-bit overflow 
check */
+#defineR_AARCH64_PREL64260 /* PC relative */
+#defineR_AARCH64_PREL32261 /* PC relative, 32-bit overflow 
check */
+#defineR_AARCH64_PREL16262 /* PC relative, 16-bit overflow 
check */
+#defineR_AARCH64_COPY  1024/* Copy data from shared object 
*/
+#defineR_AARCH64_GLOB_DAT  1025/* Set GOT entry to data 
address */
+#defineR_AARCH64_JUMP_SLOT 1026/* Set GOT entry to code 
address */
+#defineR_AARCH64_RELATIVE  1027/* Add load address of shared 
object */
+#defineR_AARCH64_TLS_DTPREL64  1028
+#defineR_AARCH64_TLS_DTPMOD64  1029
+#defineR_AARCH64_TLS_TPREL64   1030
+#defineR_AARCH64_TLSDESC   1031/* Identify the TLS descriptor 
*/
+#defineR_AARCH64_IRELATIVE 1032
Index: sys/arch/armv7/stand/efiboot/Makefile.arm64
===
RCS file: sys/arch/armv7/stand/efiboot/Makefile.arm64
diff -N sys/arch/armv7/stand/efiboot/Makefile.arm64
--- /dev/null   1 Jan 1970 00:00:00 -
+++ sys/arch/armv7/stand/efiboot/Makefile.arm64 23 Nov 2016 03:54:28 -
@@ -0,0 +1,80 @@
+#  $OpenBSD: Makefile,v 1.6 2016/11/06 16:42:00 tb Exp $
+
+NOMAN= #
+
+.if ${MACHINE} == "armv8"
+
+PROG=  BOOTAA64.EFI
+OBJFMT=binary
+INSTALL_STRIP=
+BINDIR=/usr/mdec
+SRCS=  start64.S self_reloc.c efiboot64.c conf.c exec64.c efidev64.c 
fdt.c
+
+S= ${.CURDIR}/../../../..
+EFIDIR=${S}/stand/efi
+
+OBJCOPY?=  objcopy
+OBJDUMP?=  objdump
+
+LDFLAGS+=-nostdlib -T ${.CURDIR}/ldscript.arm64 -Bsymbolic -shared
+
+.PATH: ${S}/stand/boot
+SRCS+= boot.c cmd.c vars.c
+
+.PATH: ${S}/lib/libsa
+SRCS+= alloc.c ctime.c exit.c getchar.c memcmp.o memcpy.o memmove.c memset.c \
+   printf.c putchar.c snprintf.c strchr.c strcmp.c strerror.c strncmp.c \
+   strncpy.c strtol.c
+SRCS+= close.c closeall.c cons.c cread.c dev.c disklabel.c dkcksum.c fstat.c \
+   lseek.c open.c read.c readdir.c stat.c
+SRCS+= loadfile.c
+SRCS+= ufs.c
+
+.PATH: ${S}/lib/libkern/arch/arm ${S}/lib/libkern
+#SRCS+=divsi3.S divdi3.c moddi3.c qdivrem.c strlcpy.c strlen.c
+SRCS+= divdi3.c moddi3.c qdivrem.c strlcpy.c strlen.c
+
+.PATH: ${S}/lib/libz
+SRCS+= adler32.c crc32.c inflate.c inftrees.c
+
+CPPFLAGS+= -nostdinc
+CPPFLAGS+= -I${S} -I. -I${.CURDIR}
+CPPFLAGS+= -I${EFIDIR}/include -I${EFIDIR}/include/arm
+CPPFLAGS+= -D_STANDALONE
+CPPFLAGS+= -DSMALL -DSLOW -DNOBYFOUR -D__INTERNAL_LIBSA_CREAD
+CPPFLAGS+= -DNEEDS_HEAP_H
+COPTS+=-Wno-attributes -Wno-format
+COPTS+=-ffreestanding -fno-stack-protector
+COPTS+=-fshort-wchar -fPIC -fno-builtin
+COPTS+=-Wall -Werror
+
+PROG.elf=  ${PROG:S/.EFI/.elf/}
+CLEANFILES+=   ${PROG.elf} ${PROG.elf}.tmp
+
+${PROG}: ${PROG.elf}
+   ${OBJCOPY} -j .peheader -j .text -j .sdata -j .data \
+   -j .dynamic -j .dynsym -j .dynstr -j .rel -j .rel.dyn \
+   -j .rela -j .rela.dyn -j .reloc \
+   --output-target=${OBJFMT} ${PROG.elf} ${.TARGET}
+
+.include 
+
+${PROG.elf}: ${OBJS}
+   ${LD} ${LDFLAGS} -o ${.TARGET}.tmp ${OBJS} ${LDADD}
+   @if ${OBJDUMP} -t ${.TARGET}.tmp | grep 'UND'; then \
+   (echo Undefined symbols; false);\
+   fi
+   mv ${.TARGET}.tmp ${.TARGET}
+
+.if !make(clean) && !make(cleandir) && !make(includes) && !make(obj)
+.BEGIN:
+   @([ -h machine ] || ln -s ${.CURDIR}/../../../${MACHINE}/include 
machine)
+   @([ -h arm ] || ln -s

Re: smtpd: more internal cleanups

2016-11-22 Thread Sunil Nimmagadda

Eric Faurot writes:

> Hi.
>
> When using the internal io api, the caller had to explicitely reset an
> io event in some cases (basically when data is buffered outside of an
> io callback context) which could easily be missed.  This has been the
> cause of some of smtpd bugs in the past (hanging sessions).
>
> After the recent changes, it's now possible to make it a lot simpler
> by triggering an event reload internally when data is queued. So the
> api user does not have to worry about it.
>
> Eric.

Ok sunil@

>
> Index: ioev.c
> ===
> RCS file: /cvs/src/usr.sbin/smtpd/ioev.c,v
> retrieving revision 1.31
> diff -u -p -r1.31 ioev.c
> --- ioev.c22 Nov 2016 07:28:42 -  1.31
> +++ ioev.c22 Nov 2016 22:24:25 -
> @@ -370,13 +370,25 @@ io_set_write(struct io *io)
>  int
>  io_write(struct io *io, const void *buf, size_t len)
>  {
> - return iobuf_queue(io->iobuf, buf, len);
> + int r;
> +
> + r = iobuf_queue(io->iobuf, buf, len);
> +
> + io_reload(io);
> +
> + return r;
>  }
>  
>  int
>  io_writev(struct io *io, const struct iovec *iov, int iovcount)
>  {
> - return iobuf_queuev(io->iobuf, iov, iovcount);
> + int r;
> +
> + r = iobuf_queuev(io->iobuf, iov, iovcount);
> +
> + io_reload(io);
> +
> + return r;
>  }
>  
>  int
> Index: mta_session.c
> ===
> RCS file: /cvs/src/usr.sbin/smtpd/mta_session.c,v
> retrieving revision 1.89
> diff -u -p -r1.89 mta_session.c
> --- mta_session.c 22 Nov 2016 07:28:42 -  1.89
> +++ mta_session.c 22 Nov 2016 22:24:26 -
> @@ -280,7 +280,6 @@ mta_session_imsg(struct mproc *p, struct
>   mta_flush_task(s, IMSG_MTA_DELIVERY_TEMPFAIL,
>   "Could not get message fd", 0, 0);
>   mta_enter_state(s, MTA_READY);
> - io_reload(&s->io);
>   return;
>   }
>  
> @@ -289,7 +288,6 @@ mta_session_imsg(struct mproc *p, struct
>   fatal("mta: fdopen");
>  
>   mta_enter_state(s, MTA_MAIL);
> - io_reload(&s->io);
>   return;
>  
>   case IMSG_MTA_DNS_PTR:
> @@ -366,7 +364,6 @@ mta_session_imsg(struct mproc *p, struct
>  
>   mta_tls_verified(s);
>   io_resume(&s->io, IO_PAUSE_IN);
> - io_reload(&s->io);
>   return;
>  
>   case IMSG_MTA_LOOKUP_HELO:
> @@ -456,7 +453,6 @@ mta_on_timeout(struct runq *runq, void *
>   s->hangon++;
>  
>   mta_enter_state(s, MTA_READY);
> - io_reload(&s->io);
>  }
>  
>  static void
> Index: smtp_session.c
> ===
> RCS file: /cvs/src/usr.sbin/smtpd/smtp_session.c,v
> retrieving revision 1.295
> diff -u -p -r1.295 smtp_session.c
> --- smtp_session.c22 Nov 2016 07:28:42 -  1.295
> +++ smtp_session.c22 Nov 2016 22:24:26 -
> @@ -738,14 +738,12 @@ smtp_session_imsg(struct mproc *p, struc
>   smtp_filter_tx_rollback(s);
>   smtp_tx_free(s->tx);
>   smtp_reply(s, "%d %s", 530, "Sender rejected");
> - io_reload(&s->io);
>   break;
>   case LKA_TEMPFAIL:
>   smtp_filter_tx_rollback(s);
>   smtp_tx_free(s->tx);
>   smtp_reply(s, "421 %s: Temporary Error",
>   esc_code(ESC_STATUS_TEMPFAIL, 
> ESC_OTHER_MAIL_SYSTEM_STATUS));
> - io_reload(&s->io);
>   break;
>   }
>   return;
> @@ -766,7 +764,6 @@ smtp_session_imsg(struct mproc *p, struc
>   case LKA_TEMPFAIL:
>   smtp_reply(s, "%s", line);
>   }
> - io_reload(&s->io);
>   return;
>  
>   case IMSG_SMTP_LOOKUP_HELO:
> @@ -802,7 +799,6 @@ smtp_session_imsg(struct mproc *p, struc
>   smtp_enter_state(s, STATE_QUIT);
>   }
>   m_end(&m);
> - io_reload(&s->io);
>   return;
>  
>   case IMSG_SMTP_MESSAGE_OPEN:
> @@ -818,7 +814,6 @@ smtp_session_imsg(struct mproc *p, struc
>   smtp_reply(s, "421 %s: Temporary Error",
>   esc_code(ESC_STATUS_TEMPFAIL, 
> ESC_OTHER_MAIL_SYSTEM_STATUS));
>   smtp_enter_state(s, STATE_QUIT);
> - io_reload(&s->io);
>   return;
>   }
>  
> @@ -872,7 +867,6 @@ smtp_session_imsg(struct mproc *p, struc
>   esc_code(ESC_STATUS_OK, 
> ESC_DESTINATION_ADDRESS_VALID),
>   esc_description(ESC_DESTINATION_ADDRESS_VALID));
>   }
> - io_reload(&s->io);
>   return;
>  
>   case IMSG_SMTP_MESSAGE_COMMI

Re: use DRM_IOCTL_GET_PCIINFO in libdrm

2016-11-22 Thread Jonathan Gray
On Tue, Nov 22, 2016 at 09:29:03PM +0100, Mark Kettenis wrote:
> > Date: Sat, 19 Nov 2016 19:31:18 +1100
> > From: Jonathan Gray 
> > 
> > Support libdrm functions required for Mesa versions >= 13.
> > 
> > On linux this information is pulled out of a psuedo filesystem, here the
> > new DRM_IOCTL_GET_PCIINFO ioctl is used for the same.
> > 
> > Only primary drm nodes are handled, render and control nodes which we
> > don't have aren't.  This also only handles devices with PCI ids.
> > 
> > drmGetMinorNameForFD() based on code the Mesa loader used to have.
> 
> See my reply to the kernel diff.  One nit...
> 
> > diff --git xf86drm.c xf86drm.c
> > index 3c2c5d4..03fe257 100644
> > --- xf86drm.c
> > +++ xf86drm.c
> > @@ -62,6 +62,10 @@
> >  #endif
> >  #include 
> >  
> > +#ifdef __OpenBSD__
> > +#include 
> > +#endif
> > +
> >  /* Not all systems have MAP_FAILED defined */
> >  #ifndef MAP_FAILED
> >  #define MAP_FAILED ((void *)-1)
> > @@ -2851,7 +2855,25 @@ static char *drmGetMinorNameForFD(int fd, int type)
> >  out_close_dir:
> >  closedir(sysdir);
> >  #else
> > -#warning "Missing implementation of drmGetMinorNameForFD"
> > +struct stat sbuf;
> > +unsigned int maj, min;
> > +char buf[0x40];
> 
> Any reason not to use PATH_MAX + 1 here?

This was carried over from the old Mesa loader code, updated diff to
use PATH_MAX and inline ioctl definition:

diff --git xf86drm.c xf86drm.c
index 3c2c5d4..38f6440 100644
--- xf86drm.c
+++ xf86drm.c
@@ -103,7 +103,23 @@
 #endif
 
 #ifdef __OpenBSD__
+
 #define X_PRIVSEP
+
+struct drm_pciinfo {
+   uint16_tdomain;
+   uint8_t bus;
+   uint8_t dev;
+   uint8_t func;
+   uint16_tvendor_id;
+   uint16_tdevice_id;
+   uint16_tsubvendor_id;
+   uint16_tsubdevice_id;
+   uint8_t revision_id;
+};
+
+#define DRM_IOCTL_GET_PCIINFO  DRM_IOR(0x15, struct drm_pciinfo)
+
 #endif
 
 #define DRM_MSG_VERBOSITY 3
@@ -2851,7 +2867,25 @@ static char *drmGetMinorNameForFD(int fd, int type)
 out_close_dir:
 closedir(sysdir);
 #else
-#warning "Missing implementation of drmGetMinorNameForFD"
+struct stat sbuf;
+unsigned int maj, min;
+char buf[PATH_MAX + 1];
+int n;
+
+if (fstat(fd, &sbuf))
+return NULL;
+
+maj = major(sbuf.st_rdev);
+min = minor(sbuf.st_rdev);
+
+if (maj != DRM_MAJOR || !S_ISCHR(sbuf.st_mode))
+return NULL;
+
+n = snprintf(buf, sizeof(buf), DRM_DEV_NAME, DRM_DIR_NAME, min);
+if (n == -1 || n >= sizeof(buf))
+return NULL;
+
+return strdup(buf);
 #endif
 return NULL;
 }
@@ -2887,6 +2921,8 @@ static int drmParseSubsystemType(int maj, int min)
 return DRM_BUS_PCI;
 
 return -EINVAL;
+#elif defined(__OpenBSD__)
+   return DRM_BUS_PCI;
 #else
 #warning "Missing implementation of drmParseSubsystemType"
 return -EINVAL;
@@ -2929,6 +2965,26 @@ static int drmParsePciBusInfo(int maj, int min, 
drmPciBusInfoPtr info)
 info->func = func;
 
 return 0;
+#elif defined(__OpenBSD__)
+struct drm_pciinfo pinfo;
+int fd;
+
+fd = drmOpenMinor(min, 0, DRM_NODE_PRIMARY);
+if (fd < 0)
+return -errno;
+
+if (drmIoctl(fd, DRM_IOCTL_GET_PCIINFO, &pinfo)) {
+close(fd);
+return -errno;
+}
+close(fd);
+
+info->domain = pinfo.domain;
+info->bus = pinfo.bus;
+info->dev = pinfo.dev;
+info->func = pinfo.func;
+
+return 0;
 #else
 #warning "Missing implementation of drmParsePciBusInfo"
 return -EINVAL;
@@ -3004,6 +3060,37 @@ static int drmParsePciDeviceInfo(const char *d_name,
 device->subdevice_id = config[46] | (config[47] << 8);
 
 return 0;
+#elif defined(__OpenBSD__)
+struct drm_pciinfo pinfo;
+char buf[PATH_MAX + 1];
+int fd, n;
+
+n = snprintf(buf, sizeof(buf), "%s/%s", DRM_DIR_NAME, d_name);
+if (n == -1 || n >= sizeof(buf))
+return -errno;
+
+#ifndef X_PRIVSEP
+fd = open(buf, O_RDWR, 0);
+#else
+fd = priv_open_device(buf);
+#endif
+
+if (fd < 0)
+return -errno;
+
+if (drmIoctl(fd, DRM_IOCTL_GET_PCIINFO, &pinfo)) {
+close(fd);
+return -errno;
+}
+close(fd);
+
+device->vendor_id = pinfo.vendor_id;
+device->device_id = pinfo.device_id;
+device->revision_id = pinfo.revision_id;
+device->subvendor_id = pinfo.subvendor_id;
+device->subdevice_id = pinfo.subdevice_id;
+
+return 0;
 #else
 #warning "Missing implementation of drmParsePciDeviceInfo"
 return -EINVAL;
@@ -3117,6 +3204,46 @@ static void drmFoldDuplicatedDevices(drmDevicePtr 
local_devices[], int count)
  */
 int drmGetDevice(int fd, drmDevicePtr *device)
 {
+#ifdef __OpenBSD__
+drmDevicePtr d;
+struct stat sbuf;
+char node[PATH_MAX + 1];
+char d_name[PATH_MAX + 1];
+int maj, min, n;
+int ret;
+int max_count = 1;
+
+if (fd == -1 || device == NULL)
+return -EINVAL;
+
+if (fstat(fd, &sbuf))
+

Re: add a new DRM_IOCTL_GET_PCIINFO ioctl

2016-11-22 Thread Jonathan Gray
On Tue, Nov 22, 2016 at 09:26:21PM +0100, Mark Kettenis wrote:
> > Date: Sat, 19 Nov 2016 19:27:25 +1100
> > From: Jonathan Gray 
> > 
> > To pull pci information from the kernel for drm devices we need a common
> > drm ioctl.  This is a requirement for implementing functions in libdrm
> > which are used by Mesa >= 13.
> > 
> > To not clash with drm headers this is added via pciio.h at kettenis'
> > suggestion.
> > 
> > The ioctl number reuses that of DRM_IOCTL_ADD_MAP, a DRI1 ioctl
> > we dropped support for, to avoid using a number that might be later
> > used in linux.
> 
> Sorry for dropping the ball on this.  My original thought was that we
> would have a 'p' ioctl instead of a 'd' ioctl with a name that is more
> in line with the existing ioctls in .  Having a #define
> for DRM_IOCTL_GET_PCIINFO in  feels wrong.
> 
> Looking at your diffs, I wonder if we shouldn't just add the
> DRM_IOCTL_GET_PCIINFO define to  for the kernel,
> and put a copy (protectected by #ifdef __OpenBSD__) in
> libdrm/xf86drm.c.

Sure, updated diff

Index: drm.h
===
RCS file: /cvs/src/sys/dev/pci/drm/drm.h,v
retrieving revision 1.20
diff -u -p -r1.20 drm.h
--- drm.h   23 Sep 2015 23:12:11 -  1.20
+++ drm.h   23 Nov 2016 00:30:50 -
@@ -706,6 +706,20 @@ struct drm_event_vblank {
u_int32_treserved;
 };
 
+#ifdef __OpenBSD__
+struct drm_pciinfo {
+   uint16_tdomain;
+   uint8_t bus;
+   uint8_t dev;
+   uint8_t func;
+   uint16_tvendor_id;
+   uint16_tdevice_id;
+   uint16_tsubvendor_id;
+   uint16_tsubdevice_id;
+   uint8_t revision_id;
+};
+#endif
+
 #include "drm_mode.h"
 
 #define DRM_IOCTL_BASE 'd'
@@ -734,7 +748,11 @@ struct drm_event_vblank {
 #define DRM_IOCTL_BLOCKDRM_IOWR(0x12, struct drm_block)
 #define DRM_IOCTL_UNBLOCK  DRM_IOWR(0x13, struct drm_block)
 #define DRM_IOCTL_CONTROL  DRM_IOW( 0x14, struct drm_control)
+#ifdef __OpenBSD__
+#define DRM_IOCTL_GET_PCIINFO  DRM_IOR( 0x15, struct drm_pciinfo)
+#else
 #define DRM_IOCTL_ADD_MAP  DRM_IOWR(0x15, struct drm_map)
+#endif
 #define DRM_IOCTL_ADD_BUFS DRM_IOWR(0x16, struct drm_buf_desc)
 #define DRM_IOCTL_MARK_BUFSDRM_IOW( 0x17, struct drm_buf_desc)
 #define DRM_IOCTL_INFO_BUFSDRM_IOWR(0x18, struct drm_buf_info)
Index: drm_drv.c
===
RCS file: /cvs/src/sys/dev/pci/drm/drm_drv.c,v
retrieving revision 1.149
diff -u -p -r1.149 drm_drv.c
--- drm_drv.c   15 Sep 2016 02:00:17 -  1.149
+++ drm_drv.c   23 Nov 2016 00:30:51 -
@@ -81,6 +81,7 @@ intdrm_version(struct drm_device *, vo
 int drm_setversion(struct drm_device *, void *, struct drm_file *);
 int drm_getmagic(struct drm_device *, void *, struct drm_file *);
 int drm_authmagic(struct drm_device *, void *, struct drm_file *);
+int drm_getpciinfo(struct drm_device *, void *, struct drm_file *);
 int drm_file_cmp(struct drm_file *, struct drm_file *);
 SPLAY_PROTOTYPE(drm_file_tree, drm_file, link, drm_file_cmp);
 
@@ -120,6 +121,8 @@ static struct drm_ioctl_desc drm_ioctls[
DRM_IOCTL_DEF(DRM_IOCTL_SET_SAREA_CTX, drm_setsareactx, 
DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_GET_SAREA_CTX, drm_getsareactx, DRM_AUTH),
 #else
+   DRM_IOCTL_DEF(DRM_IOCTL_GET_PCIINFO, drm_getpciinfo, 
DRM_UNLOCKED|DRM_RENDER_ALLOW),
+
DRM_IOCTL_DEF(DRM_IOCTL_SET_SAREA_CTX, drm_noop, 
DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
DRM_IOCTL_DEF(DRM_IOCTL_GET_SAREA_CTX, drm_noop, DRM_AUTH),
 #endif
@@ -1345,5 +1348,23 @@ int drm_pcie_get_speed_cap_mask(struct d
 
DRM_INFO("probing gen 2 caps for device 0x%04x:0x%04x = %x/%x\n",
PCI_VENDOR(id), PCI_PRODUCT(id), lnkcap, lnkcap2);
+   return 0;
+}
+
+int
+drm_getpciinfo(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+   struct drm_pciinfo *info = data;
+
+   info->domain = 0;
+   info->bus = dev->pdev->bus->number;
+   info->dev = PCI_SLOT(dev->pdev->devfn);
+   info->func = PCI_FUNC(dev->pdev->devfn);
+   info->vendor_id = dev->pdev->vendor;
+   info->device_id = dev->pdev->device;
+   info->subvendor_id = dev->pdev->subsystem_vendor;
+   info->subdevice_id = dev->pdev->subsystem_device;
+   info->revision_id = 0;
+
return 0;
 }



Re: pf af-to route output

2016-11-22 Thread Mike Belopuhov
On 23 November 2016 at 01:42, Alexander Bluhm  wrote:
> On Tue, Nov 22, 2016 at 01:44:09PM +0100, Mike Belopuhov wrote:
>> OK, all I wanted to know was if this is know to work and if it has
>> been tested.  I'd argue that we don't put the code that doesn't work
>> or not tested or we don't know what it does :)
>
> After looking at all the cases, it will be hard to test the at-to
> with route-to combinations.  As the feature never worked, let's
> disable it.  If someone has a usecase, he can put it back.
>
> ok?
>
> bluhm
>

I agree.  If someone makes it work and gets it tested,
we can consider including the code.



Re: pf af-to route output

2016-11-22 Thread Alexander Bluhm
On Tue, Nov 22, 2016 at 01:44:09PM +0100, Mike Belopuhov wrote:
> OK, all I wanted to know was if this is know to work and if it has
> been tested.  I'd argue that we don't put the code that doesn't work
> or not tested or we don't know what it does :)

After looking at all the cases, it will be hard to test the at-to
with route-to combinations.  As the feature never worked, let's
disable it.  If someone has a usecase, he can put it back.

ok?

bluhm

Index: sys/net/pf.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/pf.c,v
retrieving revision 1.1001
diff -u -p -r1.1001 pf.c
--- sys/net/pf.c22 Nov 2016 19:29:54 -  1.1001
+++ sys/net/pf.c23 Nov 2016 00:00:30 -
@@ -6878,28 +6878,16 @@ done:
action = PF_DROP;
break;
}
-   if (r->rt) {
-   switch (pd.naf) {
-   case AF_INET:
-   pf_route(&pd, r, s);
-   break;
-   case AF_INET6:
-   pf_route6(&pd, r, s);
-   break;
-   }
-   }
-   if (pd.m) {
-   pd.m->m_pkthdr.pf.flags |= PF_TAG_GENERATED;
-   switch (pd.naf) {
-   case AF_INET:
-   ip_output(pd.m, NULL, NULL, 0, NULL, NULL, 0);
-   break;
-   case AF_INET6:
-   ip6_output(pd.m, NULL, NULL, 0, NULL, NULL);
-   break;
-   }
-   pd.m = NULL;
+   pd.m->m_pkthdr.pf.flags |= PF_TAG_GENERATED;
+   switch (pd.naf) {
+   case AF_INET:
+   ip_output(pd.m, NULL, NULL, 0, NULL, NULL, 0);
+   break;
+   case AF_INET6:
+   ip6_output(pd.m, NULL, NULL, 0, NULL, NULL);
+   break;
}
+   pd.m = NULL;
action = PF_PASS;
break;
 #endif /* INET6 */
Index: sbin/pfctl/parse.y
===
RCS file: /data/mirror/openbsd/cvs/src/sbin/pfctl/parse.y,v
retrieving revision 1.655
diff -u -p -r1.655 parse.y
--- sbin/pfctl/parse.y  26 Aug 2016 06:06:58 -  1.655
+++ sbin/pfctl/parse.y  23 Nov 2016 00:07:42 -
@@ -1530,6 +1530,11 @@ pfrule   : action dir logquick interface 
yyerror("af-to can only be used with direction 
in");
YYERROR;
}
+   if (($8.marker & FOM_AFTO) && $8.route.rt) {
+   yyerror("af-to cannot be used together with "
+   "route-to, reply-to, dup-to");
+   YYERROR;
+   }
r.af = $5;
 
if ($8.tag)



Re: vio(4): fixup crash on up/down

2016-11-22 Thread Mike Belopuhov
On Wed, Nov 23, 2016 at 00:42 +0100, Stefan Fritsch wrote:
> On Wednesday, 23 November 2016 00:34:42 CET Mike Belopuhov wrote:
> > Yes, after looking closer at virtio code I agree with you.
> > However, stop/init is purely OpenBSD specific action.  There
> > are no provisions or requirements from the hardware really.
> > Thus we can treat UP/DOWN as purely software state and don't
> > stop/reinit the PCI device.
> 
> I guess we could do that. But then we cannot free the mbufs on DOWN
> until the device has used them.

Diff to this effect is below.  Works on vmd and qemu (original
one didn't because I kept the virtio_reset).

> That sounds like an unnecessary waste of memory to me.
> 

This is not so much memory we lose and then if you up it again
you're going to have it all back.  We can revert to the present
behavior once vmd matures, in the meantime people won't have to
juggle diffs around in their trees :)

The diff is meant to be applied on top of the previous two.
Please note I've also added a virtio_stop_vq_intr to vio_stop.

diff --git sys/dev/pci/if_vio.c sys/dev/pci/if_vio.c
index 68e6636..156c16f 100644
--- sys/dev/pci/if_vio.c
+++ sys/dev/pci/if_vio.c
@@ -596,10 +596,13 @@ vio_attach(struct device *parent, struct device *self, 
void *aux)
ifmedia_set(&sc->sc_media, IFM_ETHER | IFM_AUTO);
vsc->sc_config_change = vio_config_change;
timeout_set(&sc->sc_txtick, vio_txtick, &sc->sc_vq[VQTX]);
timeout_set(&sc->sc_rxtick, vio_rxtick, &sc->sc_vq[VQRX]);
 
+   if_rxr_init(&sc->sc_rx_ring, 2 * ((ifp->if_hardmtu / MCLBYTES) + 1),
+   sc->sc_vq[VQRX].vq_num);
+
if_attach(ifp);
ether_ifattach(ifp);
 
return;
 
@@ -665,19 +668,14 @@ vio_init(struct ifnet *ifp)
 {
struct vio_softc *sc = ifp->if_softc;
struct virtio_softc *vsc = sc->sc_virtio;
 
vio_stop(ifp, 0);
-   if_rxr_init(&sc->sc_rx_ring, 2 * ((ifp->if_hardmtu / MCLBYTES) + 1),
-   sc->sc_vq[VQRX].vq_num);
-   virtio_reinit_start(vsc);
-   virtio_negotiate_features(vsc, vsc->sc_features, NULL);
virtio_start_vq_intr(vsc, &sc->sc_vq[VQRX]);
virtio_stop_vq_intr(vsc, &sc->sc_vq[VQTX]);
if (vsc->sc_nvqs >= 3)
virtio_start_vq_intr(vsc, &sc->sc_vq[VQCTL]);
-   virtio_reinit_end(vsc);
vio_populate_rx_mbufs(sc);
ifp->if_flags |= IFF_RUNNING;
ifq_clr_oactive(&ifp->if_snd);
vio_iff(sc);
vio_link_state(ifp);
@@ -692,18 +690,17 @@ vio_stop(struct ifnet *ifp, int disable)
 
timeout_del(&sc->sc_txtick);
timeout_del(&sc->sc_rxtick);
ifp->if_flags &= ~IFF_RUNNING;
ifq_clr_oactive(&ifp->if_snd);
-   /* only way to stop I/O and DMA is resetting... */
-   virtio_reset(vsc);
vio_rxeof(sc);
if (vsc->sc_nvqs >= 3)
vio_ctrleof(&sc->sc_vq[VQCTL]);
-   vio_tx_drain(sc);
-   if (disable)
-   vio_rx_drain(sc);
+
+   virtio_stop_vq_intr(vsc, &sc->sc_vq[VQRX]);
+   if (vsc->sc_nvqs >= 3)
+   virtio_stop_vq_intr(vsc, &sc->sc_vq[VQCTL]);
 
if (vsc->sc_nvqs >= 3) {
if (sc->sc_ctrl_inuse != FREE)
sc->sc_ctrl_inuse = RESET;
wakeup(&sc->sc_ctrl_inuse);
@@ -1029,22 +1026,26 @@ vio_rxeof(struct vio_softc *sc)
mlast = m;
bufs_left--;
}
 
if (bufs_left == 0) {
-   ml_enqueue(&ml, m0);
+   if (ifp->if_flags & IFF_RUNNING)
+   ml_enqueue(&ml, m0);
+   else
+   m_freem(m0);
m0 = NULL;
}
}
if (m0 != NULL) {
DBGPRINT("expected %d buffers, got %d", (int)hdr->num_buffers,
(int)hdr->num_buffers - bufs_left);
ifp->if_ierrors++;
m_freem(m0);
}
 
-   if_input(ifp, &ml);
+   if (ifp->if_flags & IFF_RUNNING)
+   if_input(ifp, &ml);
return r;
 }
 
 int
 vio_rx_intr(struct virtqueue *vq)



Re: vio(4): fixup crash on up/down

2016-11-22 Thread Stefan Fritsch
On Wednesday, 23 November 2016 00:34:42 CET Mike Belopuhov wrote:
> Yes, after looking closer at virtio code I agree with you.
> However, stop/init is purely OpenBSD specific action.  There
> are no provisions or requirements from the hardware really.
> Thus we can treat UP/DOWN as purely software state and don't
> stop/reinit the PCI device.

I guess we could do that. But then we cannot free the mbufs on DOWN until the 
device has used them. That sounds like an unnecessary waste of memory to me.




Relocate some of the vio(4) (re)init code into vio_init

2016-11-22 Thread Mike Belopuhov
This moves code that is supposed to be in init, but is in stop
right now.  Tested on qemu.  OK?

diff --git sys/dev/pci/if_vio.c sys/dev/pci/if_vio.c
index d8c9798..68e6636 100644
--- sys/dev/pci/if_vio.c
+++ sys/dev/pci/if_vio.c
@@ -662,14 +662,22 @@ vio_media_status(struct ifnet *ifp, struct ifmediareq 
*imr)
  */
 int
 vio_init(struct ifnet *ifp)
 {
struct vio_softc *sc = ifp->if_softc;
+   struct virtio_softc *vsc = sc->sc_virtio;
 
vio_stop(ifp, 0);
if_rxr_init(&sc->sc_rx_ring, 2 * ((ifp->if_hardmtu / MCLBYTES) + 1),
sc->sc_vq[VQRX].vq_num);
+   virtio_reinit_start(vsc);
+   virtio_negotiate_features(vsc, vsc->sc_features, NULL);
+   virtio_start_vq_intr(vsc, &sc->sc_vq[VQRX]);
+   virtio_stop_vq_intr(vsc, &sc->sc_vq[VQTX]);
+   if (vsc->sc_nvqs >= 3)
+   virtio_start_vq_intr(vsc, &sc->sc_vq[VQCTL]);
+   virtio_reinit_end(vsc);
vio_populate_rx_mbufs(sc);
ifp->if_flags |= IFF_RUNNING;
ifq_clr_oactive(&ifp->if_snd);
vio_iff(sc);
vio_link_state(ifp);
@@ -693,17 +701,10 @@ vio_stop(struct ifnet *ifp, int disable)
vio_ctrleof(&sc->sc_vq[VQCTL]);
vio_tx_drain(sc);
if (disable)
vio_rx_drain(sc);
 
-   virtio_reinit_start(vsc);
-   virtio_negotiate_features(vsc, vsc->sc_features, NULL);
-   virtio_start_vq_intr(vsc, &sc->sc_vq[VQRX]);
-   virtio_stop_vq_intr(vsc, &sc->sc_vq[VQTX]);
-   if (vsc->sc_nvqs >= 3)
-   virtio_start_vq_intr(vsc, &sc->sc_vq[VQCTL]);
-   virtio_reinit_end(vsc);
if (vsc->sc_nvqs >= 3) {
if (sc->sc_ctrl_inuse != FREE)
sc->sc_ctrl_inuse = RESET;
wakeup(&sc->sc_ctrl_inuse);
}



Re: vio(4): fixup crash on up/down

2016-11-22 Thread Mike Belopuhov
On Wed, Nov 23, 2016 at 00:09 +0100, s...@openbsd.org wrote:
> On Tue, 22 Nov 2016, Mike Belopuhov wrote:
> 
> > vmd hackers and users seem to be able to trigger the assertion
> > below every time they do down and then up (with a dhclient for
> > instance):
> > 
> >   panic: kernel diagnostic assertion "m != NULL" failed: file 
> > "/usr/src/sys/dev/pci/if_vio.c", line 1008
> >   Starting stack trace...
> >   panic() at panic+0x10b
> >   __assert() at __assert+0x25
> >   vio_rxeof() at vio_rxeof+0x1db
> >   vio_rx_intr() at vio_rx_intr+0x28
> >   virtio_check_vqs() at virtio_check_vqs+0x8c
> >   virtio_pci_legacy_intr() at virtio_pci_legacy_intr+0x71
> >   intr_handler() at intr_handler+0x67
> >   Xintr_legacy7() at Xintr_legacy7+0xdd
> >   --- interrupt ---
> > 
> > The culprit is that vio_rx_drain and vio_tx_drain functions
> > are not properly implemented: they m_freem mbufs attached to
> > ring slots but do nothing about unconfiguring the backend
> > virtqueue.  This results in us thinking that the ring is
> > empty when we UP the interface the next time, but then a
> > completion event arrives for the slot that doesn't have an
> > associated mbuf anymore.
> 
> No, that's wrong. vio_*_drain don't need to do anything with the rings 
> because the device is reset before they are called. (the virtio_reset() 
> call). It sounds like vmd does not implement the reset properly. When the 
> device is reset, it
> 
> * must stop all dma activities to the rings
> * must stop interrupts
> * should probably forget about the configured virtqueues.
> 
> It must do that before the io-port write to the status register finishes 
> and returns control to the guest.
> 
> As far as I can see, vio(4) does even the right thing if an interrupt 
> occurs between splnet() and virtio_reset(). It will call vio_rxeof() after 
> the reset, which will remove all used descriptors from the ring. Even if 
> vio_rx_intr() would be called after splx(), there cannot be any used 
> descriptors in the ring, the call to vio_rxeof() will return 0, and 
> vio_rx_intr() will do nothing. But this could be made more explicit (and 
> robust?) by putting a check for IF_RUNNING into vio_rx_intr().
> 
> > To fix this properly a virtqueue draining code has to be
> > implemented, possibly based on what FreeBSD has.  However,
> > there's nothing really wrong with leaving the rings the
> > way they are and "resuming" on UP, vio doesn't destroy
> > them anyway.
> 
> vio must reset the device because otherwise it could not remove 
> descriptors from the available ring before the device has used them. And 
> the device won't use them unless packets arrive.
> 
> > The other part of the diff is removing virtio_reinit_start
> > and virtio_negotiate_features.  I'm not sure why does it
> > make sense to re-negotiate features on stop, but reinit
> > functions mess around the receive ring and if I just leave
> > them there, the interface will not be operational on UP.
> 
> The reinit is necessary after reset.
> 
> Cheers,
> Stefan
> 
> 

Yes, after looking closer at virtio code I agree with you.
However, stop/init is purely OpenBSD specific action.  There
are no provisions or requirements from the hardware really.
Thus we can treat UP/DOWN as purely software state and don't
stop/reinit the PCI device.



Fixup tsleeps in vio(4) and make them interruptable

2016-11-22 Thread Mike Belopuhov
This fixes up tsleep priorities in vio(4) and makes sleeps
interruptable so that ifconfig can be killed with ^C.  Tested
on qemu with a previous diff that makes vio(4) hang there.

OK?

diff --git sys/dev/pci/if_vio.c sys/dev/pci/if_vio.c
index 97af2a2..d8c9798 100644
--- sys/dev/pci/if_vio.c
+++ sys/dev/pci/if_vio.c
@@ -281,11 +281,11 @@ int   vio_ctrl_rx(struct vio_softc *, int, int);
 intvio_set_rx_filter(struct vio_softc *);
 void   vio_iff(struct vio_softc *);
 intvio_media_change(struct ifnet *);
 void   vio_media_status(struct ifnet *, struct ifmediareq *);
 intvio_ctrleof(struct virtqueue *);
-void   vio_wait_ctrl(struct vio_softc *sc);
+intvio_wait_ctrl(struct vio_softc *sc);
 intvio_wait_ctrl_done(struct vio_softc *sc);
 void   vio_ctrl_wakeup(struct vio_softc *, enum vio_ctrl_state);
 intvio_alloc_mem(struct vio_softc *);
 intvio_alloc_dmamem(struct vio_softc *);
 void   vio_free_dmamem(struct vio_softc *);
@@ -1219,11 +1219,12 @@ vio_ctrl_rx(struct vio_softc *sc, int cmd, int onoff)
 
if (vsc->sc_nvqs < 3)
return ENOTSUP;
 
splassert(IPL_NET);
-   vio_wait_ctrl(sc);
+   if ((r = vio_wait_ctrl(sc)) != 0)
+   return r;
 
sc->sc_ctrl_cmd->class = VIRTIO_NET_CTRL_RX;
sc->sc_ctrl_cmd->command = cmd;
sc->sc_ctrl_rx->onoff = onoff;
 
@@ -1246,14 +1247,12 @@ vio_ctrl_rx(struct vio_softc *sc, int cmd, int onoff)
sizeof(*sc->sc_ctrl_rx), 1);
VIO_DMAMEM_ENQUEUE(sc, vq, slot, sc->sc_ctrl_status,
sizeof(*sc->sc_ctrl_status), 0);
virtio_enqueue_commit(vsc, vq, slot, 1);
 
-   if (vio_wait_ctrl_done(sc)) {
-   r = EIO;
+   if ((r = vio_wait_ctrl_done(sc)) != 0)
goto out;
-   }
 
VIO_DMAMEM_SYNC(vsc, sc, sc->sc_ctrl_cmd,
sizeof(*sc->sc_ctrl_cmd), BUS_DMASYNC_POSTWRITE);
VIO_DMAMEM_SYNC(vsc, sc, sc->sc_ctrl_rx,
sizeof(*sc->sc_ctrl_rx), BUS_DMASYNC_POSTWRITE);
@@ -1271,28 +1270,38 @@ vio_ctrl_rx(struct vio_softc *sc, int cmd, int onoff)
 out:
vio_ctrl_wakeup(sc, FREE);
return r;
 }
 
-void
+int
 vio_wait_ctrl(struct vio_softc *sc)
 {
-   while (sc->sc_ctrl_inuse != FREE)
-   tsleep(&sc->sc_ctrl_inuse, IPL_NET, "vio_wait", 0);
+   int r = 0;
+
+   while (sc->sc_ctrl_inuse != FREE) {
+   r = tsleep(&sc->sc_ctrl_inuse, PRIBIO|PCATCH, "viowait", 0);
+   if (r == EINTR)
+   return r;
+   }
sc->sc_ctrl_inuse = INUSE;
+
+   return r;
 }
 
 int
 vio_wait_ctrl_done(struct vio_softc *sc)
 {
int r = 0;
+
while (sc->sc_ctrl_inuse != DONE && sc->sc_ctrl_inuse != RESET) {
if (sc->sc_ctrl_inuse == RESET) {
-   r = 1;
+   r = EIO;
break;
}
-   tsleep(&sc->sc_ctrl_inuse, IPL_NET, "vio_wait", 0);
+   r = tsleep(&sc->sc_ctrl_inuse, PRIBIO|PCATCH, "viodone", 0);
+   if (r == EINTR)
+   break;
}
return r;
 }
 
 void
@@ -1334,11 +1343,12 @@ vio_set_rx_filter(struct vio_softc *sc)
splassert(IPL_NET);
 
if (vsc->sc_nvqs < 3)
return ENOTSUP;
 
-   vio_wait_ctrl(sc);
+   if ((r = vio_wait_ctrl(sc)) != 0)
+   return r;
 
sc->sc_ctrl_cmd->class = VIRTIO_NET_CTRL_MAC;
sc->sc_ctrl_cmd->command = VIRTIO_NET_CTRL_MAC_TABLE_SET;
 
VIO_DMAMEM_SYNC(vsc, sc, sc->sc_ctrl_cmd,
@@ -1364,14 +1374,12 @@ vio_set_rx_filter(struct vio_softc *sc)
sc->sc_ctrl_mac_tbl_mc->nentries * ETHER_ADDR_LEN, 1);
VIO_DMAMEM_ENQUEUE(sc, vq, slot, sc->sc_ctrl_status,
sizeof(*sc->sc_ctrl_status), 0);
virtio_enqueue_commit(vsc, vq, slot, 1);
 
-   if (vio_wait_ctrl_done(sc)) {
-   r = EIO;
+   if ((r = vio_wait_ctrl_done(sc)) != 0)
goto out;
-   }
 
VIO_DMAMEM_SYNC(vsc, sc, sc->sc_ctrl_cmd,
sizeof(*sc->sc_ctrl_cmd), BUS_DMASYNC_POSTWRITE);
VIO_DMAMEM_SYNC(vsc, sc, sc->sc_ctrl_mac_info,
VIO_CTRL_MAC_INFO_SIZE, BUS_DMASYNC_POSTWRITE);
@@ -1436,29 +1444,22 @@ vio_iff(struct vio_softc *sc)
 
/* set unicast address, VirtualBox wants that */
memcpy(sc->sc_ctrl_mac_tbl_uc->macs[0], ac->ac_enaddr, ETHER_ADDR_LEN);
sc->sc_ctrl_mac_tbl_uc->nentries = 1;
 
-   if (rxfilter) {
+   if (rxfilter)
sc->sc_ctrl_mac_tbl_mc->nentries = nentries;
-   r = vio_set_rx_filter(sc);
-   if (r != 0) {
-   rxfilter = 0;
-   allmulti = 1; /* fallback */
-   }
-   } else {
-   sc->sc_ctrl_mac_tbl_mc->nentries = 0;
-   vio_set_rx_filter(sc);
-   }
 
-   if (allmulti) {
-   r = vio_ctrl_rx(sc, VIRTIO_NET_CTRL_RX_ALLMULTI, 1);
-  

Re: small indentation and spelling diff

2016-11-22 Thread Larry Hynes
On 2016-11-22, Alexander Bluhm  wrote:
> On Mon, Nov 21, 2016 at 05:36:21PM -0700, Kyle Milz wrote:
>> - * everywhere. It wouldn't surprise me if several stacks accidently
>> + * everywhere. It wouldn't surprise me if several stacks accidentally

> /usr/share/dict/words has both variants.

If /usr/share/dict/words could talk, it might tell you that accidently
is listed as obsolete by the OED, that it is used approximately
1:15 vis-a-vis accidentally in US English and 1:28 in UK (and it
might suggest that a lot of those usages may well have been in
error).

If accidently had a man page it would state that accidently has
been obsoleted by accidentally, which is suggested for normal use.



Re: vio(4): fixup crash on up/down

2016-11-22 Thread Mike Larkin
On Wed, Nov 23, 2016 at 12:09:57AM +0100, s...@openbsd.org wrote:
> On Tue, 22 Nov 2016, Mike Belopuhov wrote:
> 
> > vmd hackers and users seem to be able to trigger the assertion
> > below every time they do down and then up (with a dhclient for
> > instance):
> > 
> >   panic: kernel diagnostic assertion "m != NULL" failed: file 
> > "/usr/src/sys/dev/pci/if_vio.c", line 1008
> >   Starting stack trace...
> >   panic() at panic+0x10b
> >   __assert() at __assert+0x25
> >   vio_rxeof() at vio_rxeof+0x1db
> >   vio_rx_intr() at vio_rx_intr+0x28
> >   virtio_check_vqs() at virtio_check_vqs+0x8c
> >   virtio_pci_legacy_intr() at virtio_pci_legacy_intr+0x71
> >   intr_handler() at intr_handler+0x67
> >   Xintr_legacy7() at Xintr_legacy7+0xdd
> >   --- interrupt ---
> > 
> > The culprit is that vio_rx_drain and vio_tx_drain functions
> > are not properly implemented: they m_freem mbufs attached to
> > ring slots but do nothing about unconfiguring the backend
> > virtqueue.  This results in us thinking that the ring is
> > empty when we UP the interface the next time, but then a
> > completion event arrives for the slot that doesn't have an
> > associated mbuf anymore.
> 
> No, that's wrong. vio_*_drain don't need to do anything with the rings 
> because the device is reset before they are called. (the virtio_reset() 
> call). It sounds like vmd does not implement the reset properly. When the 

It doesn't. At least not yet.



Re: vio(4): fixup crash on up/down

2016-11-22 Thread sf
On Tue, 22 Nov 2016, Mike Belopuhov wrote:

> vmd hackers and users seem to be able to trigger the assertion
> below every time they do down and then up (with a dhclient for
> instance):
> 
>   panic: kernel diagnostic assertion "m != NULL" failed: file 
> "/usr/src/sys/dev/pci/if_vio.c", line 1008
>   Starting stack trace...
>   panic() at panic+0x10b
>   __assert() at __assert+0x25
>   vio_rxeof() at vio_rxeof+0x1db
>   vio_rx_intr() at vio_rx_intr+0x28
>   virtio_check_vqs() at virtio_check_vqs+0x8c
>   virtio_pci_legacy_intr() at virtio_pci_legacy_intr+0x71
>   intr_handler() at intr_handler+0x67
>   Xintr_legacy7() at Xintr_legacy7+0xdd
>   --- interrupt ---
> 
> The culprit is that vio_rx_drain and vio_tx_drain functions
> are not properly implemented: they m_freem mbufs attached to
> ring slots but do nothing about unconfiguring the backend
> virtqueue.  This results in us thinking that the ring is
> empty when we UP the interface the next time, but then a
> completion event arrives for the slot that doesn't have an
> associated mbuf anymore.

No, that's wrong. vio_*_drain don't need to do anything with the rings 
because the device is reset before they are called. (the virtio_reset() 
call). It sounds like vmd does not implement the reset properly. When the 
device is reset, it

* must stop all dma activities to the rings
* must stop interrupts
* should probably forget about the configured virtqueues.

It must do that before the io-port write to the status register finishes 
and returns control to the guest.

As far as I can see, vio(4) does even the right thing if an interrupt 
occurs between splnet() and virtio_reset(). It will call vio_rxeof() after 
the reset, which will remove all used descriptors from the ring. Even if 
vio_rx_intr() would be called after splx(), there cannot be any used 
descriptors in the ring, the call to vio_rxeof() will return 0, and 
vio_rx_intr() will do nothing. But this could be made more explicit (and 
robust?) by putting a check for IF_RUNNING into vio_rx_intr().

> To fix this properly a virtqueue draining code has to be
> implemented, possibly based on what FreeBSD has.  However,
> there's nothing really wrong with leaving the rings the
> way they are and "resuming" on UP, vio doesn't destroy
> them anyway.

vio must reset the device because otherwise it could not remove 
descriptors from the available ring before the device has used them. And 
the device won't use them unless packets arrive.

> The other part of the diff is removing virtio_reinit_start
> and virtio_negotiate_features.  I'm not sure why does it
> make sense to re-negotiate features on stop, but reinit
> functions mess around the receive ring and if I just leave
> them there, the interface will not be operational on UP.

The reinit is necessary after reset.

Cheers,
Stefan


> 
> So far this has been tested under vmd by rzalamena@ and
> reyk@ but it would be nice to get it tested under the KVM.
> 
> 
> 
> diff --git sys/dev/pci/if_vio.c sys/dev/pci/if_vio.c
> index 97af2a2..96b5fdf 100644
> --- sys/dev/pci/if_vio.c
> +++ sys/dev/pci/if_vio.c
> @@ -596,10 +596,13 @@ vio_attach(struct device *parent, struct device *self, 
> void *aux)
>   ifmedia_set(&sc->sc_media, IFM_ETHER | IFM_AUTO);
>   vsc->sc_config_change = vio_config_change;
>   timeout_set(&sc->sc_txtick, vio_txtick, &sc->sc_vq[VQTX]);
>   timeout_set(&sc->sc_rxtick, vio_rxtick, &sc->sc_vq[VQRX]);
>  
> + if_rxr_init(&sc->sc_rx_ring, 2 * ((ifp->if_hardmtu / MCLBYTES) + 1),
> + sc->sc_vq[VQRX].vq_num);
> +
>   if_attach(ifp);
>   ether_ifattach(ifp);
>  
>   return;
>  
> @@ -664,12 +667,10 @@ int
>  vio_init(struct ifnet *ifp)
>  {
>   struct vio_softc *sc = ifp->if_softc;
>  
>   vio_stop(ifp, 0);
> - if_rxr_init(&sc->sc_rx_ring, 2 * ((ifp->if_hardmtu / MCLBYTES) + 1),
> - sc->sc_vq[VQRX].vq_num);
>   vio_populate_rx_mbufs(sc);
>   ifp->if_flags |= IFF_RUNNING;
>   ifq_clr_oactive(&ifp->if_snd);
>   vio_iff(sc);
>   vio_link_state(ifp);
> @@ -689,21 +690,15 @@ vio_stop(struct ifnet *ifp, int disable)
>   /* only way to stop I/O and DMA is resetting... */
>   virtio_reset(vsc);
>   vio_rxeof(sc);
>   if (vsc->sc_nvqs >= 3)
>   vio_ctrleof(&sc->sc_vq[VQCTL]);
> - vio_tx_drain(sc);
> - if (disable)
> - vio_rx_drain(sc);
>  
> - virtio_reinit_start(vsc);
> - virtio_negotiate_features(vsc, vsc->sc_features, NULL);
>   virtio_start_vq_intr(vsc, &sc->sc_vq[VQRX]);
>   virtio_stop_vq_intr(vsc, &sc->sc_vq[VQTX]);
>   if (vsc->sc_nvqs >= 3)
>   virtio_start_vq_intr(vsc, &sc->sc_vq[VQCTL]);
> - virtio_reinit_end(vsc);
>   if (vsc->sc_nvqs >= 3) {
>   if (sc->sc_ctrl_inuse != FREE)
>   sc->sc_ctrl_inuse = RESET;
>   wakeup(&sc->sc_ctrl_inuse);
>   }
> 



smtpd: more internal cleanups

2016-11-22 Thread Eric Faurot
Hi.

When using the internal io api, the caller had to explicitely reset an
io event in some cases (basically when data is buffered outside of an
io callback context) which could easily be missed.  This has been the
cause of some of smtpd bugs in the past (hanging sessions).

After the recent changes, it's now possible to make it a lot simpler
by triggering an event reload internally when data is queued. So the
api user does not have to worry about it.

Eric.

Index: ioev.c
===
RCS file: /cvs/src/usr.sbin/smtpd/ioev.c,v
retrieving revision 1.31
diff -u -p -r1.31 ioev.c
--- ioev.c  22 Nov 2016 07:28:42 -  1.31
+++ ioev.c  22 Nov 2016 22:24:25 -
@@ -370,13 +370,25 @@ io_set_write(struct io *io)
 int
 io_write(struct io *io, const void *buf, size_t len)
 {
-   return iobuf_queue(io->iobuf, buf, len);
+   int r;
+
+   r = iobuf_queue(io->iobuf, buf, len);
+
+   io_reload(io);
+
+   return r;
 }
 
 int
 io_writev(struct io *io, const struct iovec *iov, int iovcount)
 {
-   return iobuf_queuev(io->iobuf, iov, iovcount);
+   int r;
+
+   r = iobuf_queuev(io->iobuf, iov, iovcount);
+
+   io_reload(io);
+
+   return r;
 }
 
 int
Index: mta_session.c
===
RCS file: /cvs/src/usr.sbin/smtpd/mta_session.c,v
retrieving revision 1.89
diff -u -p -r1.89 mta_session.c
--- mta_session.c   22 Nov 2016 07:28:42 -  1.89
+++ mta_session.c   22 Nov 2016 22:24:26 -
@@ -280,7 +280,6 @@ mta_session_imsg(struct mproc *p, struct
mta_flush_task(s, IMSG_MTA_DELIVERY_TEMPFAIL,
"Could not get message fd", 0, 0);
mta_enter_state(s, MTA_READY);
-   io_reload(&s->io);
return;
}
 
@@ -289,7 +288,6 @@ mta_session_imsg(struct mproc *p, struct
fatal("mta: fdopen");
 
mta_enter_state(s, MTA_MAIL);
-   io_reload(&s->io);
return;
 
case IMSG_MTA_DNS_PTR:
@@ -366,7 +364,6 @@ mta_session_imsg(struct mproc *p, struct
 
mta_tls_verified(s);
io_resume(&s->io, IO_PAUSE_IN);
-   io_reload(&s->io);
return;
 
case IMSG_MTA_LOOKUP_HELO:
@@ -456,7 +453,6 @@ mta_on_timeout(struct runq *runq, void *
s->hangon++;
 
mta_enter_state(s, MTA_READY);
-   io_reload(&s->io);
 }
 
 static void
Index: smtp_session.c
===
RCS file: /cvs/src/usr.sbin/smtpd/smtp_session.c,v
retrieving revision 1.295
diff -u -p -r1.295 smtp_session.c
--- smtp_session.c  22 Nov 2016 07:28:42 -  1.295
+++ smtp_session.c  22 Nov 2016 22:24:26 -
@@ -738,14 +738,12 @@ smtp_session_imsg(struct mproc *p, struc
smtp_filter_tx_rollback(s);
smtp_tx_free(s->tx);
smtp_reply(s, "%d %s", 530, "Sender rejected");
-   io_reload(&s->io);
break;
case LKA_TEMPFAIL:
smtp_filter_tx_rollback(s);
smtp_tx_free(s->tx);
smtp_reply(s, "421 %s: Temporary Error",
esc_code(ESC_STATUS_TEMPFAIL, 
ESC_OTHER_MAIL_SYSTEM_STATUS));
-   io_reload(&s->io);
break;
}
return;
@@ -766,7 +764,6 @@ smtp_session_imsg(struct mproc *p, struc
case LKA_TEMPFAIL:
smtp_reply(s, "%s", line);
}
-   io_reload(&s->io);
return;
 
case IMSG_SMTP_LOOKUP_HELO:
@@ -802,7 +799,6 @@ smtp_session_imsg(struct mproc *p, struc
smtp_enter_state(s, STATE_QUIT);
}
m_end(&m);
-   io_reload(&s->io);
return;
 
case IMSG_SMTP_MESSAGE_OPEN:
@@ -818,7 +814,6 @@ smtp_session_imsg(struct mproc *p, struc
smtp_reply(s, "421 %s: Temporary Error",
esc_code(ESC_STATUS_TEMPFAIL, 
ESC_OTHER_MAIL_SYSTEM_STATUS));
smtp_enter_state(s, STATE_QUIT);
-   io_reload(&s->io);
return;
}
 
@@ -872,7 +867,6 @@ smtp_session_imsg(struct mproc *p, struc
esc_code(ESC_STATUS_OK, 
ESC_DESTINATION_ADDRESS_VALID),
esc_description(ESC_DESTINATION_ADDRESS_VALID));
}
-   io_reload(&s->io);
return;
 
case IMSG_SMTP_MESSAGE_COMMIT:
@@ -887,7 +881,6 @@ smtp_session_imsg(struct mproc *p, struc
smtp_reply(s, "421 %s: Temporary failure",
esc_code(ESC_

Re: reloading pf through ansible easy hook

2016-11-22 Thread gwes

On 11/22/16 15:36, John Boeske wrote:

On Tue, Nov 22, 2016 at 10:46 AM, John Boeske wrote



I don't understand this philosophical point - why wouldn't you want
the rc.d framework to manage pf, quota, etc. whenever it's natural.
With pf, for example, it surely is.

One of the reasons I loved AIX's System Resource Controller (SRC)
was that it did present a unified management interface to all
system resources whether daemon or built in.



Using a consistent rc.d/rcctl framework to manage system services
and daemons - even pkg_add daemons - seems a good idea. Consistent
interfaces, fewer interfaces, less special-casing all simplify
management, thus minimize the chance of error and enhance security.

This is true whether management is local or through a remote tool
like ansible. Or?


Oops.  Meant "pkg_script daemons" above...

John


I designed a single-point-of-control management system for
AIX clusters. It was five software layers - three on the control
two on the controlled system. All of the layers were necessary
for reliable and reconfigurable operation.

As you may think it wasn't easy. It *did* work very well and
was configurable. It had to be since it rode on top of all
the normal Unix-like system control programs. It *was*
reconfigurable so it could track AIX updates (and IBM
requirement changes).

Before you advocate adding layers consider all the possible
error paths including hangs. Consider tracking interactions
between subsystems. Consider how you'll report
errors and status. To do it right isn't simple.

Geoff Steckel



Re: vio(4): fixup crash on up/down

2016-11-22 Thread Mike Larkin
On Tue, Nov 22, 2016 at 09:45:30PM +0100, Mike Belopuhov wrote:
> vmd hackers and users seem to be able to trigger the assertion
> below every time they do down and then up (with a dhclient for
> instance):
> 
>   panic: kernel diagnostic assertion "m != NULL" failed: file 
> "/usr/src/sys/dev/pci/if_vio.c", line 1008
>   Starting stack trace...
>   panic() at panic+0x10b
>   __assert() at __assert+0x25
>   vio_rxeof() at vio_rxeof+0x1db
>   vio_rx_intr() at vio_rx_intr+0x28
>   virtio_check_vqs() at virtio_check_vqs+0x8c
>   virtio_pci_legacy_intr() at virtio_pci_legacy_intr+0x71
>   intr_handler() at intr_handler+0x67
>   Xintr_legacy7() at Xintr_legacy7+0xdd
>   --- interrupt ---
> 
> The culprit is that vio_rx_drain and vio_tx_drain functions
> are not properly implemented: they m_freem mbufs attached to
> ring slots but do nothing about unconfiguring the backend
> virtqueue.  This results in us thinking that the ring is
> empty when we UP the interface the next time, but then a
> completion event arrives for the slot that doesn't have an
> associated mbuf anymore.
> 
> To fix this properly a virtqueue draining code has to be
> implemented, possibly based on what FreeBSD has.  However,
> there's nothing really wrong with leaving the rings the
> way they are and "resuming" on UP, vio doesn't destroy
> them anyway.
> 
> The other part of the diff is removing virtio_reinit_start
> and virtio_negotiate_features.  I'm not sure why does it
> make sense to re-negotiate features on stop, but reinit
> functions mess around the receive ring and if I just leave
> them there, the interface will not be operational on UP.
> 
> So far this has been tested under vmd by rzalamena@ and
> reyk@ but it would be nice to get it tested under the KVM.
> 

This has been on my to-do list to look at for a while, thanks a ton
mikeb for tackling this.

-ml

> 
> 
> diff --git sys/dev/pci/if_vio.c sys/dev/pci/if_vio.c
> index 97af2a2..96b5fdf 100644
> --- sys/dev/pci/if_vio.c
> +++ sys/dev/pci/if_vio.c
> @@ -596,10 +596,13 @@ vio_attach(struct device *parent, struct device *self, 
> void *aux)
>   ifmedia_set(&sc->sc_media, IFM_ETHER | IFM_AUTO);
>   vsc->sc_config_change = vio_config_change;
>   timeout_set(&sc->sc_txtick, vio_txtick, &sc->sc_vq[VQTX]);
>   timeout_set(&sc->sc_rxtick, vio_rxtick, &sc->sc_vq[VQRX]);
>  
> + if_rxr_init(&sc->sc_rx_ring, 2 * ((ifp->if_hardmtu / MCLBYTES) + 1),
> + sc->sc_vq[VQRX].vq_num);
> +
>   if_attach(ifp);
>   ether_ifattach(ifp);
>  
>   return;
>  
> @@ -664,12 +667,10 @@ int
>  vio_init(struct ifnet *ifp)
>  {
>   struct vio_softc *sc = ifp->if_softc;
>  
>   vio_stop(ifp, 0);
> - if_rxr_init(&sc->sc_rx_ring, 2 * ((ifp->if_hardmtu / MCLBYTES) + 1),
> - sc->sc_vq[VQRX].vq_num);
>   vio_populate_rx_mbufs(sc);
>   ifp->if_flags |= IFF_RUNNING;
>   ifq_clr_oactive(&ifp->if_snd);
>   vio_iff(sc);
>   vio_link_state(ifp);
> @@ -689,21 +690,15 @@ vio_stop(struct ifnet *ifp, int disable)
>   /* only way to stop I/O and DMA is resetting... */
>   virtio_reset(vsc);
>   vio_rxeof(sc);
>   if (vsc->sc_nvqs >= 3)
>   vio_ctrleof(&sc->sc_vq[VQCTL]);
> - vio_tx_drain(sc);
> - if (disable)
> - vio_rx_drain(sc);
>  
> - virtio_reinit_start(vsc);
> - virtio_negotiate_features(vsc, vsc->sc_features, NULL);
>   virtio_start_vq_intr(vsc, &sc->sc_vq[VQRX]);
>   virtio_stop_vq_intr(vsc, &sc->sc_vq[VQTX]);
>   if (vsc->sc_nvqs >= 3)
>   virtio_start_vq_intr(vsc, &sc->sc_vq[VQCTL]);
> - virtio_reinit_end(vsc);
>   if (vsc->sc_nvqs >= 3) {
>   if (sc->sc_ctrl_inuse != FREE)
>   sc->sc_ctrl_inuse = RESET;
>   wakeup(&sc->sc_ctrl_inuse);
>   }
> 



vio(4): fixup crash on up/down

2016-11-22 Thread Mike Belopuhov
vmd hackers and users seem to be able to trigger the assertion
below every time they do down and then up (with a dhclient for
instance):

  panic: kernel diagnostic assertion "m != NULL" failed: file 
"/usr/src/sys/dev/pci/if_vio.c", line 1008
  Starting stack trace...
  panic() at panic+0x10b
  __assert() at __assert+0x25
  vio_rxeof() at vio_rxeof+0x1db
  vio_rx_intr() at vio_rx_intr+0x28
  virtio_check_vqs() at virtio_check_vqs+0x8c
  virtio_pci_legacy_intr() at virtio_pci_legacy_intr+0x71
  intr_handler() at intr_handler+0x67
  Xintr_legacy7() at Xintr_legacy7+0xdd
  --- interrupt ---

The culprit is that vio_rx_drain and vio_tx_drain functions
are not properly implemented: they m_freem mbufs attached to
ring slots but do nothing about unconfiguring the backend
virtqueue.  This results in us thinking that the ring is
empty when we UP the interface the next time, but then a
completion event arrives for the slot that doesn't have an
associated mbuf anymore.

To fix this properly a virtqueue draining code has to be
implemented, possibly based on what FreeBSD has.  However,
there's nothing really wrong with leaving the rings the
way they are and "resuming" on UP, vio doesn't destroy
them anyway.

The other part of the diff is removing virtio_reinit_start
and virtio_negotiate_features.  I'm not sure why does it
make sense to re-negotiate features on stop, but reinit
functions mess around the receive ring and if I just leave
them there, the interface will not be operational on UP.

So far this has been tested under vmd by rzalamena@ and
reyk@ but it would be nice to get it tested under the KVM.



diff --git sys/dev/pci/if_vio.c sys/dev/pci/if_vio.c
index 97af2a2..96b5fdf 100644
--- sys/dev/pci/if_vio.c
+++ sys/dev/pci/if_vio.c
@@ -596,10 +596,13 @@ vio_attach(struct device *parent, struct device *self, 
void *aux)
ifmedia_set(&sc->sc_media, IFM_ETHER | IFM_AUTO);
vsc->sc_config_change = vio_config_change;
timeout_set(&sc->sc_txtick, vio_txtick, &sc->sc_vq[VQTX]);
timeout_set(&sc->sc_rxtick, vio_rxtick, &sc->sc_vq[VQRX]);
 
+   if_rxr_init(&sc->sc_rx_ring, 2 * ((ifp->if_hardmtu / MCLBYTES) + 1),
+   sc->sc_vq[VQRX].vq_num);
+
if_attach(ifp);
ether_ifattach(ifp);
 
return;
 
@@ -664,12 +667,10 @@ int
 vio_init(struct ifnet *ifp)
 {
struct vio_softc *sc = ifp->if_softc;
 
vio_stop(ifp, 0);
-   if_rxr_init(&sc->sc_rx_ring, 2 * ((ifp->if_hardmtu / MCLBYTES) + 1),
-   sc->sc_vq[VQRX].vq_num);
vio_populate_rx_mbufs(sc);
ifp->if_flags |= IFF_RUNNING;
ifq_clr_oactive(&ifp->if_snd);
vio_iff(sc);
vio_link_state(ifp);
@@ -689,21 +690,15 @@ vio_stop(struct ifnet *ifp, int disable)
/* only way to stop I/O and DMA is resetting... */
virtio_reset(vsc);
vio_rxeof(sc);
if (vsc->sc_nvqs >= 3)
vio_ctrleof(&sc->sc_vq[VQCTL]);
-   vio_tx_drain(sc);
-   if (disable)
-   vio_rx_drain(sc);
 
-   virtio_reinit_start(vsc);
-   virtio_negotiate_features(vsc, vsc->sc_features, NULL);
virtio_start_vq_intr(vsc, &sc->sc_vq[VQRX]);
virtio_stop_vq_intr(vsc, &sc->sc_vq[VQTX]);
if (vsc->sc_nvqs >= 3)
virtio_start_vq_intr(vsc, &sc->sc_vq[VQCTL]);
-   virtio_reinit_end(vsc);
if (vsc->sc_nvqs >= 3) {
if (sc->sc_ctrl_inuse != FREE)
sc->sc_ctrl_inuse = RESET;
wakeup(&sc->sc_ctrl_inuse);
}



Re: reloading pf through ansible easy hook

2016-11-22 Thread John Boeske
On Tue, Nov 22, 2016 at 10:46 AM, John Boeske wrote
> On Mon, Nov 21, 2016 at 3:48 PM, Antoine Jacoutet wrote
> > On Mon, Nov 21, 2016 at 05:34:35PM -0500, sven falempin wrote:
> > > Ansible is already managing pkg and service of openBSD , cool
> > >
> > > If one want to manage pf with it, and push or modify a few files,
> > > on must run - command: /sbin/pfctl -f {{ dank.config }}
> > >
> > > Yet - service could be use, if this glue was in the rc.d directory :
> >
> > You can easily create an ansible role|module to do that natively.
> > The rc.d framework is only meant to handle real daemons.
> > We don't want it to manage pf, quota, network, mounts...
> 
> I don't understand this philosophical point - why wouldn't you want
> the rc.d framework to manage pf, quota, etc. whenever it's natural.
> With pf, for example, it surely is.
> 
> One of the reasons I loved AIX's System Resource Controller (SRC)
> was that it did present a unified management interface to all
> system resources whether daemon or built in.

> Using a consistent rc.d/rcctl framework to manage system services
> and daemons - even pkg_add daemons - seems a good idea. Consistent
> interfaces, fewer interfaces, less special-casing all simplify
> management, thus minimize the chance of error and enhance security.
> 
> This is true whether management is local or through a remote tool
> like ansible. Or?

Oops.  Meant "pkg_script daemons" above...

John



Re: use DRM_IOCTL_GET_PCIINFO in libdrm

2016-11-22 Thread Mark Kettenis
> Date: Sat, 19 Nov 2016 19:31:18 +1100
> From: Jonathan Gray 
> 
> Support libdrm functions required for Mesa versions >= 13.
> 
> On linux this information is pulled out of a psuedo filesystem, here the
> new DRM_IOCTL_GET_PCIINFO ioctl is used for the same.
> 
> Only primary drm nodes are handled, render and control nodes which we
> don't have aren't.  This also only handles devices with PCI ids.
> 
> drmGetMinorNameForFD() based on code the Mesa loader used to have.

See my reply to the kernel diff.  One nit...

> diff --git xf86drm.c xf86drm.c
> index 3c2c5d4..03fe257 100644
> --- xf86drm.c
> +++ xf86drm.c
> @@ -62,6 +62,10 @@
>  #endif
>  #include 
>  
> +#ifdef __OpenBSD__
> +#include 
> +#endif
> +
>  /* Not all systems have MAP_FAILED defined */
>  #ifndef MAP_FAILED
>  #define MAP_FAILED ((void *)-1)
> @@ -2851,7 +2855,25 @@ static char *drmGetMinorNameForFD(int fd, int type)
>  out_close_dir:
>  closedir(sysdir);
>  #else
> -#warning "Missing implementation of drmGetMinorNameForFD"
> +struct stat sbuf;
> +unsigned int maj, min;
> +char buf[0x40];

Any reason not to use PATH_MAX + 1 here?

> +int n;
> +
> +if (fstat(fd, &sbuf))
> +return NULL;
> +
> +maj = major(sbuf.st_rdev);
> +min = minor(sbuf.st_rdev);
> +
> +if (maj != DRM_MAJOR || !S_ISCHR(sbuf.st_mode))
> +return NULL;
> +
> +n = snprintf(buf, sizeof(buf), DRM_DEV_NAME, DRM_DIR_NAME, min);
> +if (n == -1 || n >= sizeof(buf))
> +return NULL;
> +
> +return strdup(buf);
>  #endif
>  return NULL;
>  }
> @@ -2887,6 +2909,8 @@ static int drmParseSubsystemType(int maj, int min)
>  return DRM_BUS_PCI;
>  
>  return -EINVAL;
> +#elif defined(__OpenBSD__)
> + return DRM_BUS_PCI;
>  #else
>  #warning "Missing implementation of drmParseSubsystemType"
>  return -EINVAL;
> @@ -2929,6 +2953,26 @@ static int drmParsePciBusInfo(int maj, int min, 
> drmPciBusInfoPtr info)
>  info->func = func;
>  
>  return 0;
> +#elif defined(__OpenBSD__)
> +struct drm_pciinfo pinfo;
> +int fd;
> +
> +fd = drmOpenMinor(min, 0, DRM_NODE_PRIMARY);
> +if (fd < 0)
> +return -errno;
> +
> +if (drmIoctl(fd, DRM_IOCTL_GET_PCIINFO, &pinfo)) {
> +close(fd);
> +return -errno;
> +}
> +close(fd);
> +
> +info->domain = pinfo.domain;
> +info->bus = pinfo.bus;
> +info->dev = pinfo.dev;
> +info->func = pinfo.func;
> +
> +return 0;
>  #else
>  #warning "Missing implementation of drmParsePciBusInfo"
>  return -EINVAL;
> @@ -3004,6 +3048,37 @@ static int drmParsePciDeviceInfo(const char *d_name,
>  device->subdevice_id = config[46] | (config[47] << 8);
>  
>  return 0;
> +#elif defined(__OpenBSD__)
> +struct drm_pciinfo pinfo;
> +char buf[0x40];

And here?

> +int fd, n;
> +
> +n = snprintf(buf, sizeof(buf), "%s/%s", DRM_DIR_NAME, d_name);
> +if (n == -1 || n >= sizeof(buf))
> +return -errno;
> +
> +#ifndef X_PRIVSEP
> +fd = open(buf, O_RDWR, 0);
> +#else
> +fd = priv_open_device(buf);
> +#endif
> +
> +if (fd < 0)
> +return -errno;
> +
> +if (drmIoctl(fd, DRM_IOCTL_GET_PCIINFO, &pinfo)) {
> +close(fd);
> +return -errno;
> +}
> +close(fd);
> +
> +device->vendor_id = pinfo.vendor_id;
> +device->device_id = pinfo.device_id;
> +device->revision_id = pinfo.revision_id;
> +device->subvendor_id = pinfo.subvendor_id;
> +device->subdevice_id = pinfo.subdevice_id;
> +
> +return 0;
>  #else
>  #warning "Missing implementation of drmParsePciDeviceInfo"
>  return -EINVAL;
> @@ -3117,6 +3192,46 @@ static void drmFoldDuplicatedDevices(drmDevicePtr 
> local_devices[], int count)
>   */
>  int drmGetDevice(int fd, drmDevicePtr *device)
>  {
> +#ifdef __OpenBSD__
> +drmDevicePtr d;
> +struct stat sbuf;
> +char node[PATH_MAX + 1];
> +char d_name[PATH_MAX + 1];
> +int maj, min, n;
> +int ret;
> +int max_count = 1;
> +
> +if (fd == -1 || device == NULL)
> +return -EINVAL;
> +
> +if (fstat(fd, &sbuf))
> +return -errno;
> +
> +maj = major(sbuf.st_rdev);
> +min = minor(sbuf.st_rdev);
> +
> +if (maj != DRM_MAJOR || !S_ISCHR(sbuf.st_mode))
> +return -EINVAL;
> +
> +n = snprintf(d_name, PATH_MAX, "drm%d", min);
> +if (n == -1 || n >= PATH_MAX)
> +  return -errno;
> +
> +n = snprintf(node, PATH_MAX, DRM_DEV_NAME, DRM_DIR_NAME, min);
> +if (n == -1 || n >= PATH_MAX)
> +  return -errno;
> +if (stat(node, &sbuf))
> +return -EINVAL;
> +
> +ret = drmProcessPciDevice(&d, d_name, node, DRM_NODE_PRIMARY,
> +   maj, min, true);
> +if (ret)
> +return ret;
> +
> +*device = d;
> +
> +return 0;
> +#else
>  drmDevicePtr *local_devices;
>  drmDevicePtr d;
>  DIR *sysdir;
> @@ -3224,6 +3339,7 @@ free_devices:
>  free_locals:
>  free(local_devices);
>  return ret;
> +#endif
> 

Re: add a new DRM_IOCTL_GET_PCIINFO ioctl

2016-11-22 Thread Mark Kettenis
> Date: Sat, 19 Nov 2016 19:27:25 +1100
> From: Jonathan Gray 
> 
> To pull pci information from the kernel for drm devices we need a common
> drm ioctl.  This is a requirement for implementing functions in libdrm
> which are used by Mesa >= 13.
> 
> To not clash with drm headers this is added via pciio.h at kettenis'
> suggestion.
> 
> The ioctl number reuses that of DRM_IOCTL_ADD_MAP, a DRI1 ioctl
> we dropped support for, to avoid using a number that might be later
> used in linux.

Sorry for dropping the ball on this.  My original thought was that we
would have a 'p' ioctl instead of a 'd' ioctl with a name that is more
in line with the existing ioctls in .  Having a #define
for DRM_IOCTL_GET_PCIINFO in  feels wrong.

Looking at your diffs, I wonder if we shouldn't just add the
DRM_IOCTL_GET_PCIINFO define to  for the kernel,
and put a copy (protectected by #ifdef __OpenBSD__) in
libdrm/xf86drm.c.

The implementation looks good, but...

> Index: dev/pci/drm/drm_drv.c
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/drm_drv.c,v
> retrieving revision 1.149
> diff -u -p -r1.149 drm_drv.c
> --- dev/pci/drm/drm_drv.c 15 Sep 2016 02:00:17 -  1.149
> +++ dev/pci/drm/drm_drv.c 19 Nov 2016 07:12:06 -
> @@ -50,6 +50,7 @@
>  #include 
>  #include  /* for TIOCSGRP */
>  #include 
> +#include  /* for DRM_IOCTL_GET_PCIINFO */
>  
>  #include 
>  #include 
> @@ -81,6 +82,7 @@ int  drm_version(struct drm_device *, vo
>  int   drm_setversion(struct drm_device *, void *, struct drm_file *);
>  int   drm_getmagic(struct drm_device *, void *, struct drm_file *);
>  int   drm_authmagic(struct drm_device *, void *, struct drm_file *);
> +int   drm_getpciinfo(struct drm_device *, void *, struct drm_file *);
>  int   drm_file_cmp(struct drm_file *, struct drm_file *);
>  SPLAY_PROTOTYPE(drm_file_tree, drm_file, link, drm_file_cmp);
>  
> @@ -120,6 +122,8 @@ static struct drm_ioctl_desc drm_ioctls[
>   DRM_IOCTL_DEF(DRM_IOCTL_SET_SAREA_CTX, drm_setsareactx, 
> DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
>   DRM_IOCTL_DEF(DRM_IOCTL_GET_SAREA_CTX, drm_getsareactx, DRM_AUTH),
>  #else
> + DRM_IOCTL_DEF(DRM_IOCTL_GET_PCIINFO, drm_getpciinfo, 0),
> +
>   DRM_IOCTL_DEF(DRM_IOCTL_SET_SAREA_CTX, drm_noop, 
> DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
>   DRM_IOCTL_DEF(DRM_IOCTL_GET_SAREA_CTX, drm_noop, DRM_AUTH),
>  #endif
> @@ -254,11 +258,12 @@ pledge_ioctl_drm(struct proc *p, long co
>   if (ioctl->flags & DRM_RENDER_ALLOW)
>   return 0;
>  
> + switch (com) {
> + case DRM_IOCTL_GET_PCIINFO:

...I think we should just mark DRM_IOCTL_GET_PCIINFO as DRM_RENDER_ALLOW.

>   /*
>* These are dangerous, but we have to allow them until we
>* have prime/dma-buf support.
>*/
> - switch (com) {
>   case DRM_IOCTL_GET_MAGIC:
>   case DRM_IOCTL_GEM_OPEN:
>   return 0;
> @@ -1345,5 +1350,23 @@ int drm_pcie_get_speed_cap_mask(struct d
>  
>   DRM_INFO("probing gen 2 caps for device 0x%04x:0x%04x = %x/%x\n",
>   PCI_VENDOR(id), PCI_PRODUCT(id), lnkcap, lnkcap2);
> + return 0;
> +}
> +
> +int
> +drm_getpciinfo(struct drm_device *dev, void *data, struct drm_file 
> *file_priv)
> +{
> + struct drm_pciinfo *info = data;
> +
> + info->domain = 0;
> + info->bus = dev->pdev->bus->number;
> + info->dev = PCI_SLOT(dev->pdev->devfn);
> + info->func = PCI_FUNC(dev->pdev->devfn);
> + info->vendor_id = dev->pdev->vendor;
> + info->device_id = dev->pdev->device;
> + info->subvendor_id = dev->pdev->subsystem_vendor;
> + info->subdevice_id = dev->pdev->subsystem_device;
> + info->revision_id = 0;
> +
>   return 0;
>  }
> Index: sys/pciio.h
> ===
> RCS file: /cvs/src/sys/sys/pciio.h,v
> retrieving revision 1.7
> diff -u -p -r1.7 pciio.h
> --- sys/pciio.h   5 Sep 2010 18:14:33 -   1.7
> +++ sys/pciio.h   19 Nov 2016 07:12:06 -
> @@ -60,6 +60,18 @@ struct pci_vga {
>   int pv_decode;
>  };
>  
> +struct drm_pciinfo {
> + uint16_tdomain;
> + uint8_t bus;
> + uint8_t dev;
> + uint8_t func;
> + uint16_tvendor_id;
> + uint16_tdevice_id;
> + uint16_tsubvendor_id;
> + uint16_tsubdevice_id;
> + uint8_t revision_id;
> +};
> +
>  #define  PCI_VGA_UNLOCK  0x00
>  #define  PCI_VGA_LOCK0x01
>  #define  PCI_VGA_TRYLOCK 0x02
> @@ -74,5 +86,7 @@ struct pci_vga {
>  #define  PCIOCGETVGA _IOWR('p', 6, struct pci_vga)
>  #define  PCIOCSETVGA _IOWR('p', 7, struct pci_vga)
>  #define  PCIOCREADMASK   _IOWR('p', 8, struct pci_io)
> +
> +#define  DRM_IOCTL_GET_PCIINFO   _IOR('d', 0x15, struct drm_pciinfo)
>  
>  #endif /* !_SYS_PCIIO_H_ */
> 
> 



Re: reloading pf through ansible easy hook

2016-11-22 Thread John Boeske
On Mon, Nov 21, 2016 at 3:48 PM, Antoine Jacoutet wrote
> On Mon, Nov 21, 2016 at 05:34:35PM -0500, sven falempin wrote:
> > Ansible is already managing pkg and service of openBSD , cool
> >
> > If one want to manage pf with it, and push or modify a few files,
> > on must run - command: /sbin/pfctl -f {{ dank.config }}
> >
> > Yet - service could be use, if this glue was in the rc.d directory :
>
> You can easily create an ansible role|module to do that natively.
> The rc.d framework is only meant to handle real daemons.
> We don't want it to manage pf, quota, network, mounts...

I don't understand this philosophical point - why wouldn't you want
the rc.d framework to manage pf, quota, etc. whenever it's natural.
With pf, for example, it surely is.

One of the reasons I loved AIX's System Resource Controller (SRC) 
was that it did present a unified management interface to all
system resources whether daemon or built in.

Using a consistent rc.d/rcctl framework to manage system services 
and daemons - even pkg_add daemons - seems a good idea. Consistent 
interfaces, fewer interfaces, less special-casing all simplify 
management, thus minimize the chance of error and enhance security.
 
This is true whether management is local or through a remote tool
like ansible. Or?

John



Unnecessary goto in ip_output()

2016-11-22 Thread Martin Pieuchot
After the last IPSEC-related refactoring this goto no longer make sense.

ok?

Index: netinet/ip_output.c
===
RCS file: /cvs/src/sys/netinet/ip_output.c,v
retrieving revision 1.330
diff -u -p -r1.330 ip_output.c
--- netinet/ip_output.c 18 Nov 2016 02:53:47 -  1.330
+++ netinet/ip_output.c 22 Nov 2016 15:51:30 -
@@ -249,11 +249,6 @@ reroute:
 * computation now.
 */
in_proto_cksum_out(m, NULL);
-
-   /* If it's not a multicast packet, try to fast-path */
-   if (!IN_MULTICAST(ip->ip_dst.s_addr)) {
-   goto sendit;
-   }
}
}
 #endif /* IPSEC */



Re: switchd(8): negotiate versions with hello

2016-11-22 Thread Reyk Floeter

> On 22.11.2016, at 16:12, Rafael Zalamena  wrote:
> 
> Teach switchd(8) how to negotiate protocol version using the hello bitmap
> header. This way switchd(8) is able to fallback or use higher version using
> the bitmap.
> 
> This diff also prevents connections from switching version in the middle of
> the operation.
> 
> This is the first step before adding a state machine to switchd(8): move the
> hello to a common function and make it step into the first state: HELLO_WAIT.
> (next diff)
> 
> ok?

Two comments below, otherwise OK.

Reyk

> 
> Index: ofp.c
> ===
> RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp.c,v
> retrieving revision 1.15
> diff -u -p -r1.15 ofp.c
> --- ofp.c 4 Nov 2016 22:27:08 -   1.15
> +++ ofp.c 22 Nov 2016 14:55:12 -
> @@ -132,6 +132,13 @@ ofp_input(struct switch_connection *con,
>   return (-1);
>   }
> 
> + if (con->con_version != OFP_V_0 &&

This should be handled by the state machine later, but OK for now.

(<= HELLO_WAIT accepts hello and any version, > HELLO_WAIT doesn't)

> + oh->oh_version != con->con_version) {
> + log_debug("wrong version %d, expected %d",
> + oh->oh_version, con->con_version);
> + return (-1);
> + }
> +
>   switch (oh->oh_version) {
>   case OFP_V_1_0:
>   if (ofp10_input(sc, con, oh, ibuf) != 0)
> @@ -165,6 +172,10 @@ ofp_open(struct privsep *ps, struct swit
>   log_info("%s: new connection %u.%u from switch %u",
>   __func__, con->con_id, con->con_instance,
>   sw == NULL ? 0 : sw->sw_id);
> +
> + /* Send the hello with the latest version we support. */
> + if (ofp_send_hello(ps->ps_env, con, OFP_V_1_3) == -1)
> + return (-1);
> 
>   return (0);
> }
> Index: ofp10.c
> ===
> RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp10.c,v
> retrieving revision 1.16
> diff -u -p -r1.16 ofp10.c
> --- ofp10.c   21 Nov 2016 18:19:51 -  1.16
> +++ ofp10.c   22 Nov 2016 15:08:02 -
> @@ -60,7 +60,7 @@ int  ofp10_validate_packet_out(struct sw
>   struct ofp_header *, struct ibuf *);
> 
> struct ofp_callback ofp10_callbacks[] = {
> - { OFP10_T_HELLO,ofp10_hello, NULL },
> + { OFP10_T_HELLO,ofp10_hello, ofp_validate_hello },
>   { OFP10_T_ERROR,NULL, ofp10_validate_error },
>   { OFP10_T_ECHO_REQUEST, ofp10_echo_request, NULL },
>   { OFP10_T_ECHO_REPLY,   NULL, NULL },
> @@ -262,13 +262,8 @@ ofp10_hello(struct switchd *sc, struct s
>   return (-1);
>   }
> 
> - /* Echo back the received Hello packet */
> - oh->oh_version = OFP_V_1_0;
> - oh->oh_length = htons(sizeof(*oh));
> - oh->oh_xid = htonl(con->con_xidnxt++);
> - if (ofp10_validate(sc, &con->con_local, &con->con_peer, oh, NULL) != 0)
> + if (ofp_recv_hello(sc, con, oh, ibuf) == -1)
>   return (-1);
> - ofp_output(con, oh, NULL);
> 
> #if 0
>   (void)write(fd, &oh, sizeof(oh));
> Index: ofp13.c
> ===
> RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp13.c,v
> retrieving revision 1.39
> diff -u -p -r1.39 ofp13.c
> --- ofp13.c   21 Nov 2016 19:33:12 -  1.39
> +++ ofp13.c   22 Nov 2016 14:49:27 -
> @@ -109,7 +109,7 @@ intofp13_tablemiss_sendctrl(struct swi
>   uint8_t);
> 
> struct ofp_callback ofp13_callbacks[] = {
> - { OFP_T_HELLO,  ofp13_hello, NULL },
> + { OFP_T_HELLO,  ofp13_hello, ofp_validate_hello },
>   { OFP_T_ERROR,  NULL, ofp13_validate_error },
>   { OFP_T_ECHO_REQUEST,   ofp13_echo_request, NULL },
>   { OFP_T_ECHO_REPLY, NULL, NULL },
> @@ -639,13 +639,8 @@ ofp13_hello(struct switchd *sc, struct s
>   return (-1);
>   }
> 
> - /* Echo back the received Hello packet */
> - oh->oh_version = OFP_V_1_3;
> - oh->oh_length = htons(sizeof(*oh));
> - oh->oh_xid = htonl(con->con_xidnxt++);
> - if (ofp13_validate(sc, &con->con_local, &con->con_peer, oh, NULL) != 0)
> + if (ofp_recv_hello(sc, con, oh, ibuf) == -1)
>   return (-1);
> - ofp_output(con, oh, NULL);
> 
>   /* Ask for switch features so we can get more information. */
>   if (ofp13_featuresrequest(sc, con) == -1)
> Index: ofp_common.c
> ===
> RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp_common.c,v
> retrieving revision 1.7
> diff -u -p -r1.7 ofp_common.c
> --- ofp_common.c  17 Nov 2016 13:10:26 -  1.7
> +++ ofp_common.c  22 Nov 2016 14:53:45 -
> @@ -43,6 +43,8 @@
> #include "switchd.h"
> #include "ofp_map.h"
> 
> +int  ofp_setversion(struct switch_connection *, int);
> +
> int

ip_input tweak

2016-11-22 Thread Martin Pieuchot
Keep blocks corresponding to the delivery path together.  This helps
when looking for differences with IPv6 and will make it easier to merge
multicast checks.

ok?

Index: netinet/ip_input.c
===
RCS file: /cvs/src/sys/netinet/ip_input.c,v
retrieving revision 1.285
diff -u -p -r1.285 ip_input.c
--- netinet/ip_input.c  14 Nov 2016 04:27:03 -  1.285
+++ netinet/ip_input.c  22 Nov 2016 13:55:48 -
@@ -357,6 +357,12 @@ ipv4_input(struct mbuf *m)
goto out;
}
 
+   if (ip->ip_dst.s_addr == INADDR_BROADCAST ||
+   ip->ip_dst.s_addr == INADDR_ANY) {
+   ip_ours(m);
+   goto out;
+   }
+
if (in_ouraddr(m, ifp, &rt)) {
ip_ours(m);
goto out;
@@ -421,12 +427,6 @@ ipv4_input(struct mbuf *m)
ipstat_inc(ips_cantforward);
goto bad;
}
-   ip_ours(m);
-   goto out;
-   }
-
-   if (ip->ip_dst.s_addr == INADDR_BROADCAST ||
-   ip->ip_dst.s_addr == INADDR_ANY) {
ip_ours(m);
goto out;
}



ip6_input() refactoring

2016-11-22 Thread Martin Pieuchot
This diff merges two "#ifdef MROUTING" blocks.  It's one more step
towards splitting ip6_input() in two.  Which is a requirement to
unlock the forwarding path without messing with the receiving path.

ok?

Index: netinet6/ip6_input.c
===
RCS file: /cvs/src/sys/netinet6/ip6_input.c,v
retrieving revision 1.169
diff -u -p -r1.169 ip6_input.c
--- netinet6/ip6_input.c14 Nov 2016 10:32:46 -  1.169
+++ netinet6/ip6_input.c22 Nov 2016 14:18:52 -
@@ -387,7 +387,6 @@ ip6_input(struct mbuf *m)
 * Multicast check
 */
if (IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) {
-
/*
 * Make sure M_MCAST is set.  It should theoretically
 * already be there, but let's play safe because upper
@@ -401,12 +400,36 @@ ip6_input(struct mbuf *m)
 */
if (in6_hasmulti(&ip6->ip6_dst, ifp))
ours = 1;
+
 #ifdef MROUTING
-   else if (!ip6_mforwarding || !ip6_mrouter)
-#else
-   else
+   if (ip6_mforwarding && ip6_mrouter) {
+   if (ip6_hbhchcheck(m, &off, &nxt, &ours)) {
+   rtfree(rt);
+   if_put(ifp);
+   return; /* m have already been freed */
+   }
+
+   ip6 = mtod(m, struct ip6_hdr *);
+
+   /*
+* If we are acting as a multicast router, all
+* incoming multicast packets are passed to the
+* kernel-level multicast forwarding function.
+* The packet is returned (relatively) intact; if
+* ip6_mforward() returns a non-zero value, the packet
+* must be discarded, else it may be accepted below.
+*/
+   if (ip6_mforward(ip6, ifp, m)) {
+   ip6stat.ip6s_cantforward++;
+   goto bad;
+   }
+
+   if (!ours)
+   goto bad;
+   goto ours;
+   }
 #endif
-   {
+   if (!ours) {
ip6stat.ip6s_notmember++;
if (!IN6_IS_ADDR_MC_LINKLOCAL(&ip6->ip6_dst))
ip6stat.ip6s_cantforward++;
@@ -485,30 +508,14 @@ ip6_input(struct mbuf *m)
/*
 * Forward if desirable.
 */
-   if (IN6_IS_ADDR_MULTICAST(&ip6->ip6_dst)) {
-   /*
-* If we are acting as a multicast router, all
-* incoming multicast packets are passed to the
-* kernel-level multicast forwarding function.
-* The packet is returned (relatively) intact; if
-* ip6_mforward() returns a non-zero value, the packet
-* must be discarded, else it may be accepted below.
-*/
-#ifdef MROUTING
-   if (ip6_mforwarding && ip6_mrouter &&
-   ip6_mforward(ip6, ifp, m)) {
-   ip6stat.ip6s_cantforward++;
-   goto bad;
-   }
-#endif
-   if (!ours)
-   goto bad;
-   } else if (!ours) {
+   if (!ours) {
ip6_forward(m, rt, srcrt);
if_put(ifp);
return;
}
-
+#ifdef MROUTING
+  ours:
+#endif
/* pf might have changed things */
in6_proto_cksum_out(m, NULL);
 



switchd(8): negotiate versions with hello

2016-11-22 Thread Rafael Zalamena
Teach switchd(8) how to negotiate protocol version using the hello bitmap
header. This way switchd(8) is able to fallback or use higher version using
the bitmap.

This diff also prevents connections from switching version in the middle of
the operation.

This is the first step before adding a state machine to switchd(8): move the
hello to a common function and make it step into the first state: HELLO_WAIT.
(next diff)

ok?

Index: ofp.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp.c,v
retrieving revision 1.15
diff -u -p -r1.15 ofp.c
--- ofp.c   4 Nov 2016 22:27:08 -   1.15
+++ ofp.c   22 Nov 2016 14:55:12 -
@@ -132,6 +132,13 @@ ofp_input(struct switch_connection *con,
return (-1);
}
 
+   if (con->con_version != OFP_V_0 &&
+   oh->oh_version != con->con_version) {
+   log_debug("wrong version %d, expected %d",
+   oh->oh_version, con->con_version);
+   return (-1);
+   }
+
switch (oh->oh_version) {
case OFP_V_1_0:
if (ofp10_input(sc, con, oh, ibuf) != 0)
@@ -165,6 +172,10 @@ ofp_open(struct privsep *ps, struct swit
log_info("%s: new connection %u.%u from switch %u",
__func__, con->con_id, con->con_instance,
sw == NULL ? 0 : sw->sw_id);
+
+   /* Send the hello with the latest version we support. */
+   if (ofp_send_hello(ps->ps_env, con, OFP_V_1_3) == -1)
+   return (-1);
 
return (0);
 }
Index: ofp10.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp10.c,v
retrieving revision 1.16
diff -u -p -r1.16 ofp10.c
--- ofp10.c 21 Nov 2016 18:19:51 -  1.16
+++ ofp10.c 22 Nov 2016 15:08:02 -
@@ -60,7 +60,7 @@ intofp10_validate_packet_out(struct sw
struct ofp_header *, struct ibuf *);
 
 struct ofp_callback ofp10_callbacks[] = {
-   { OFP10_T_HELLO,ofp10_hello, NULL },
+   { OFP10_T_HELLO,ofp10_hello, ofp_validate_hello },
{ OFP10_T_ERROR,NULL, ofp10_validate_error },
{ OFP10_T_ECHO_REQUEST, ofp10_echo_request, NULL },
{ OFP10_T_ECHO_REPLY,   NULL, NULL },
@@ -262,13 +262,8 @@ ofp10_hello(struct switchd *sc, struct s
return (-1);
}
 
-   /* Echo back the received Hello packet */
-   oh->oh_version = OFP_V_1_0;
-   oh->oh_length = htons(sizeof(*oh));
-   oh->oh_xid = htonl(con->con_xidnxt++);
-   if (ofp10_validate(sc, &con->con_local, &con->con_peer, oh, NULL) != 0)
+   if (ofp_recv_hello(sc, con, oh, ibuf) == -1)
return (-1);
-   ofp_output(con, oh, NULL);
 
 #if 0
(void)write(fd, &oh, sizeof(oh));
Index: ofp13.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp13.c,v
retrieving revision 1.39
diff -u -p -r1.39 ofp13.c
--- ofp13.c 21 Nov 2016 19:33:12 -  1.39
+++ ofp13.c 22 Nov 2016 14:49:27 -
@@ -109,7 +109,7 @@ int  ofp13_tablemiss_sendctrl(struct swi
uint8_t);
 
 struct ofp_callback ofp13_callbacks[] = {
-   { OFP_T_HELLO,  ofp13_hello, NULL },
+   { OFP_T_HELLO,  ofp13_hello, ofp_validate_hello },
{ OFP_T_ERROR,  NULL, ofp13_validate_error },
{ OFP_T_ECHO_REQUEST,   ofp13_echo_request, NULL },
{ OFP_T_ECHO_REPLY, NULL, NULL },
@@ -639,13 +639,8 @@ ofp13_hello(struct switchd *sc, struct s
return (-1);
}
 
-   /* Echo back the received Hello packet */
-   oh->oh_version = OFP_V_1_3;
-   oh->oh_length = htons(sizeof(*oh));
-   oh->oh_xid = htonl(con->con_xidnxt++);
-   if (ofp13_validate(sc, &con->con_local, &con->con_peer, oh, NULL) != 0)
+   if (ofp_recv_hello(sc, con, oh, ibuf) == -1)
return (-1);
-   ofp_output(con, oh, NULL);
 
/* Ask for switch features so we can get more information. */
if (ofp13_featuresrequest(sc, con) == -1)
Index: ofp_common.c
===
RCS file: /home/obsdcvs/src/usr.sbin/switchd/ofp_common.c,v
retrieving revision 1.7
diff -u -p -r1.7 ofp_common.c
--- ofp_common.c17 Nov 2016 13:10:26 -  1.7
+++ ofp_common.c22 Nov 2016 14:53:45 -
@@ -43,6 +43,8 @@
 #include "switchd.h"
 #include "ofp_map.h"
 
+intofp_setversion(struct switch_connection *, int);
+
 int
 ofp_validate_header(struct switchd *sc,
 struct sockaddr_storage *src, struct sockaddr_storage *dst,
@@ -114,6 +116,177 @@ ofp_output(struct switch_connection *con
}
 
ofrelay_write(con, buf);
+
+   return (0);
+}
+
+int
+ofp_send_hello(struct switchd *sc, struct switch_connection *con, int version)
+{
+   stru

Re: reloading pf through ansible easy hook

2016-11-22 Thread BARDOU Pierre
I know the official validate command is pfctl -nf, but if you do so, you need 
to register the result of this task, then make one more conditional task to 
apply.
This doubles your playbook execution time, which is not acceptable for me.

--
Cordialement,
Pierre BARDOU


-Message d'origine-
De : owner-t...@openbsd.org [mailto:owner-t...@openbsd.org] De la part de 
Landry Breuil
Envoyé : mardi 22 novembre 2016 14:53
À : tech@openbsd.org
Objet : Re: reloading pf through ansible easy hook

On Tue, Nov 22, 2016 at 11:15:01AM +, BARDOU Pierre wrote:
> Hello,
> 
> - name: "Loading pf.conf"
>   template: src=pf.conf dest=/etc/ validate="pfctl -f %s"

Fwiw, i find it nicer to validate with 'pfctl -nf' ..

Landry



Re: reloading pf through ansible easy hook

2016-11-22 Thread Landry Breuil
On Tue, Nov 22, 2016 at 11:15:01AM +, BARDOU Pierre wrote:
> Hello,
> 
> - name: "Loading pf.conf"
>   template: src=pf.conf dest=/etc/ validate="pfctl -f %s"

Fwiw, i find it nicer to validate with 'pfctl -nf' ..

Landry



Re: pf af-to route output

2016-11-22 Thread Mike Belopuhov
On 22 November 2016 at 13:35, Alexander Bluhm  wrote:
> On Mon, Nov 21, 2016 at 10:47:34PM +0100, Mike Belopuhov wrote:
>> On 21 November 2016 at 22:38, Alexandr Nedvedicky
>> > The bluhm's change should not alter behavior of older code.
>> Yes, it just adds something new.
>
> I did not try to add something new, I have preserved what was there
> in pf_route().  I have moved the "if (!r->rt)" from pf_route() to
> the "case PF_AFRT" in pf_test().  Now it is more obvious what is
> happening and we ask ourselves "does it work?".  I have not tested
> it.
>

Not exactly.  You're now performing two output actions, while only
one was done before.  If r->rt was specified it was no longer a valid
af-to usage.



Re: pf af-to route output

2016-11-22 Thread Mike Belopuhov
On 22 November 2016 at 13:35, Alexander Bluhm  wrote:
> On Mon, Nov 21, 2016 at 10:47:34PM +0100, Mike Belopuhov wrote:
>> On 21 November 2016 at 22:38, Alexandr Nedvedicky
>> > The bluhm's change should not alter behavior of older code.
>> Yes, it just adds something new.
>
> I did not try to add something new, I have preserved what was there
> in pf_route().  I have moved the "if (!r->rt)" from pf_route() to
> the "case PF_AFRT" in pf_test().  Now it is more obvious what is
> happening and we ask ourselves "does it work?".  I have not tested
> it.
>
> The parser does not accpet the obvious thing:
> pass in on net1 inet af-to inet6 from 2001:db8::1 to 2001:db8::/96 route-to 
> 2001:db8::1@net0
>
> This might actually work:
> pass in inet all flags S/SA af-to inet6 from 0.0.0.0 dup-to em0
> Although pfctl prints the from 0.0.0.0 in the wrong af.
>
> The parser accepts this, but I doubt that pf will create a valid
> IPv6 packet with the 1.2.3.4 address.
> pass in inet all flags S/SA af-to inet6 from 0.0.0.0 dup-to 1.2.3.4@em0
>
> So we have a kernel implementation that might work for a feature
> that makes sense.  Especially the reply-to could be useful.  But
> the parser is too dumb.  I think we should fix the parser and then
> test the kernel.
>
> bluhm

OK, all I wanted to know was if this is know to work and if it has
been tested.  I'd argue that we don't put the code that doesn't work
or not tested or we don't know what it does :)

When af-to was devised, it wasn't meant to be compatible with route-to
because they do the same thing: policy routing.  I didn't think about
dup-to back then, but this one makes sense to me.  However dup-to is
now operating on the translated packet and pfctl parser might not
expect that if destination address is provided.  Should it be fixed or
simultaneous specification of dup-to/route-to/reply-to and af-to
should be prohibited?



Re: pf af-to route output

2016-11-22 Thread Alexandr Nedvedicky
> So we have a kernel implementation that might work for a feature
> that makes sense.  Especially the reply-to could be useful.  But
> the parser is too dumb.  I think we should fix the parser and then
> test the kernel.

I absolutely agree.

regards
sasha



Re: pf af-to route output

2016-11-22 Thread Alexander Bluhm
On Mon, Nov 21, 2016 at 10:47:34PM +0100, Mike Belopuhov wrote:
> On 21 November 2016 at 22:38, Alexandr Nedvedicky
> > The bluhm's change should not alter behavior of older code.
> Yes, it just adds something new.

I did not try to add something new, I have preserved what was there
in pf_route().  I have moved the "if (!r->rt)" from pf_route() to
the "case PF_AFRT" in pf_test().  Now it is more obvious what is
happening and we ask ourselves "does it work?".  I have not tested
it.

The parser does not accpet the obvious thing:
pass in on net1 inet af-to inet6 from 2001:db8::1 to 2001:db8::/96 route-to 
2001:db8::1@net0

This might actually work:
pass in inet all flags S/SA af-to inet6 from 0.0.0.0 dup-to em0
Although pfctl prints the from 0.0.0.0 in the wrong af.

The parser accepts this, but I doubt that pf will create a valid
IPv6 packet with the 1.2.3.4 address.
pass in inet all flags S/SA af-to inet6 from 0.0.0.0 dup-to 1.2.3.4@em0

So we have a kernel implementation that might work for a feature
that makes sense.  Especially the reply-to could be useful.  But
the parser is too dumb.  I think we should fix the parser and then
test the kernel.

bluhm



umb string padding

2016-11-22 Thread Gerhard Roth
This patch fixes a bug in the padding of umb strings. Instead of
padding the right position, umb_padding() would always zero padding
bytes at the beginning of the buffer.

For the two callers of umb_addstr(), this won't hurt in
umb_send_connect() since the first value in the buffer is the
session id, which is zero anyway. But for umb_setpin() we might try to
change the wrong type of PIN, iff the length of the PIN is not a
multiple of 4.

Gerhard


Index: sys/dev/usb/if_umb.c
===
RCS file: /cvs/src/sys/dev/usb/if_umb.c,v
retrieving revision 1.7
diff -u -p -u -p -r1.7 if_umb.c
--- sys/dev/usb/if_umb.c21 Nov 2016 08:19:36 -  1.7
+++ sys/dev/usb/if_umb.c22 Nov 2016 11:41:49 -
@@ -1211,7 +1211,7 @@ umb_padding(void *data, int len, size_t 
int  np = 0;
 
while (len < sz && (len % 4) != 0) {
-   *p++ = '\0';
+   *(p + len) = '\0';
len++;
np++;
}



Re: ifioctl, pr_ctlinput, pr_slowtimo & pr_fasttimo

2016-11-22 Thread Alexander Bluhm
On Tue, Nov 22, 2016 at 11:39:32AM +0100, Martin Pieuchot wrote:
> On 21/11/16(Mon) 20:07, Alexander Bluhm wrote:
> > [...] 
> > NFS hits you again.  nfs_boot_init() calls ifioctl().  Perhaps put
> > the splsoftnet() inside ifioctl().
> 
> I'll put it inside nfs_boot() because the function also iterate on
> structures that need to be protected.  Note that this should not matter
> during boot, but we might have to play tricks if we put too many
> asserts.

OK bluhm@

> 
> Index: kern/sys_socket.c
> ===
> RCS file: /cvs/src/sys/kern/sys_socket.c,v
> retrieving revision 1.24
> diff -u -p -r1.24 sys_socket.c
> --- kern/sys_socket.c 21 Nov 2016 10:30:42 -  1.24
> +++ kern/sys_socket.c 22 Nov 2016 10:30:18 -
> @@ -119,8 +119,12 @@ soo_ioctl(struct file *fp, u_long cmd, c
>* interface and routing ioctls should have a
>* different entry since a socket's unnecessary
>*/
> - if (IOCGROUP(cmd) == 'i')
> - return (ifioctl(so, cmd, data, p));
> + if (IOCGROUP(cmd) == 'i') {
> + s = splsoftnet();
> + error = ifioctl(so, cmd, data, p);
> + splx(s);
> + return (error);
> + }
>   if (IOCGROUP(cmd) == 'r')
>   return (EOPNOTSUPP);
>   s = splsoftnet();
> Index: nfs/nfs_boot.c
> ===
> RCS file: /cvs/src/sys/nfs/nfs_boot.c,v
> retrieving revision 1.39
> diff -u -p -r1.39 nfs_boot.c
> --- nfs/nfs_boot.c1 Sep 2015 21:24:04 -   1.39
> +++ nfs/nfs_boot.c22 Nov 2016 10:36:36 -
> @@ -122,7 +122,7 @@ nfs_boot_init(struct nfs_diskless *nd, s
>   struct socket *so;
>   struct ifaddr *ifa;
>   char addr[INET_ADDRSTRLEN];
> - int error;
> + int s, error;
>  
>   /*
>* Find an interface, rarp for its ip address, stuff it, the
> @@ -159,11 +159,15 @@ nfs_boot_init(struct nfs_diskless *nd, s
>*/
>   if ((error = socreate(AF_INET, &so, SOCK_DGRAM, 0)) != 0)
>   panic("nfs_boot: socreate, error=%d", error);
> + s = splsoftnet();
>   error = ifioctl(so, SIOCGIFFLAGS, (caddr_t)&ireq, procp);
> + splx(s);
>   if (error)
>   panic("nfs_boot: GIFFLAGS, error=%d", error);
>   ireq.ifr_flags |= IFF_UP;
> + s = splsoftnet();
>   error = ifioctl(so, SIOCSIFFLAGS, (caddr_t)&ireq, procp);
> + splx(s);
>   if (error)
>   panic("nfs_boot: SIFFLAGS, error=%d", error);
>  
> @@ -186,7 +190,9 @@ nfs_boot_init(struct nfs_diskless *nd, s
>   sin->sin_len = sizeof(*sin);
>   sin->sin_family = AF_INET;
>   sin->sin_addr.s_addr = my_ip.s_addr;
> + s = splsoftnet();
>   error = ifioctl(so, SIOCAIFADDR, (caddr_t)&ifra, procp);
> + splx(s);
>   if (error)
>   panic("nfs_boot: set if addr, error=%d", error);
>  



Re: global mbuf memory limit

2016-11-22 Thread Mark Kettenis
> Date: Tue, 22 Nov 2016 12:42:39 +1000
> From: David Gwynne 
> 
> right now pools that make up mbufs are each limited individually.
> 
> the following diff instead has the mbuf layer have a global limit
> on the amount of memory that can be allocated to the pools. this
> is enforced by wrapping the multi page pool allocator with something
> that checks the mbuf memory limit first.
> 
> this means all mbufs will use a max of 2k * nmbclust bytes instead
> of each pool being able to use that amount each.
> 
> ok?

Mostly makes sense to me.  Not sure the complixty of copying the
supported page sizes from the multi-page pool allocator is worth the
additional complication.  I'd probably just initialize it the same way
using POOL_ALLOC_SIZES(PAGE_SIZE, 1UL<<31, POOL_ALLOC_ALIGNED).

Wouldn't it make sense to use atomic operations to keep track of the
amount of memory that was allocated?

Long run I suppose we want to drop nmbclust and let users tune the
total amount of memory available for clusters and set the initial
amount to a percentage of physical memory?


> Index: sys/pool.h
> ===
> RCS file: /cvs/src/sys/sys/pool.h,v
> retrieving revision 1.68
> diff -u -p -r1.68 pool.h
> --- sys/pool.h21 Nov 2016 01:44:06 -  1.68
> +++ sys/pool.h22 Nov 2016 02:31:47 -
> @@ -205,6 +205,7 @@ struct pool {
>  #ifdef _KERNEL
>  
>  extern struct pool_allocator pool_allocator_single;
> +extern struct pool_allocator pool_allocator_multi;
>  
>  struct pool_request {
>   TAILQ_ENTRY(pool_request) pr_entry;
> Index: sys/mbuf.h
> ===
> RCS file: /cvs/src/sys/sys/mbuf.h,v
> retrieving revision 1.222
> diff -u -p -r1.222 mbuf.h
> --- sys/mbuf.h24 Oct 2016 04:38:44 -  1.222
> +++ sys/mbuf.h22 Nov 2016 02:31:47 -
> @@ -416,6 +416,7 @@ struct mbuf_queue {
>  };
>  
>  #ifdef   _KERNEL
> +struct pool;
>  
>  extern   int nmbclust;   /* limit on the # of clusters */
>  extern   int mblowat;/* mbuf low water mark */
> @@ -444,6 +445,7 @@ int   m_leadingspace(struct mbuf *);
>  int  m_trailingspace(struct mbuf *);
>  struct mbuf *m_clget(struct mbuf *, int, u_int);
>  void m_extref(struct mbuf *, struct mbuf *);
> +void m_pool_init(struct pool *, u_int, u_int, const char *);
>  void m_extfree_pool(caddr_t, u_int, void *);
>  void m_adj(struct mbuf *, int);
>  int  m_copyback(struct mbuf *, int, int, const void *, int);
> Index: kern/uipc_mbuf.c
> ===
> RCS file: /cvs/src/sys/kern/uipc_mbuf.c,v
> retrieving revision 1.238
> diff -u -p -r1.238 uipc_mbuf.c
> --- kern/uipc_mbuf.c  9 Nov 2016 08:55:11 -   1.238
> +++ kern/uipc_mbuf.c  22 Nov 2016 02:31:47 -
> @@ -133,6 +133,19 @@ void m_extfree(struct mbuf *);
>  void nmbclust_update(void);
>  void m_zero(struct mbuf *);
>  
> +struct mutex m_pool_mtx = MUTEX_INITIALIZER(IPL_NET);
> +unsigned int mbuf_mem_limit; /* how much memory can we allocated */
> +unsigned int mbuf_mem_alloc; /* how much memory has been allocated */
> +
> +void *m_pool_alloc(struct pool *, int, int *);
> +void m_pool_free(struct pool *, void *);
> +
> +struct pool_allocator m_pool_allocator = {
> + m_pool_alloc,
> + m_pool_free,
> + 0 /* will be copied from pool_allocator_multi */
> +};
> +
>  static void (*mextfree_fns[4])(caddr_t, u_int, void *);
>  static u_int num_extfree_fns;
>  
> @@ -148,6 +161,11 @@ mbinit(void)
>   int i;
>   unsigned int lowbits;
>  
> + m_pool_allocator.pa_pagesz = pool_allocator_multi.pa_pagesz;
> +
> + nmbclust_update();
> + mbuf_mem_alloc = 0;
> +
>  #if DIAGNOSTIC
>   if (mclsizes[0] != MCLBYTES)
>   panic("mbinit: the smallest cluster size != MCLBYTES");
> @@ -155,9 +173,7 @@ mbinit(void)
>   panic("mbinit: the largest cluster size != MAXMCLBYTES");
>  #endif
>  
> - pool_init(&mbpool, MSIZE, 0, IPL_NET, 0, "mbufpl", NULL);
> - pool_set_constraints(&mbpool, &kp_dma_contig);
> - pool_setlowat(&mbpool, mblowat);
> + m_pool_init(&mbpool, MSIZE, 64, "mbufpl");
>  
>   pool_init(&mtagpool, PACKET_TAG_MAXSIZE + sizeof(struct m_tag), 0,
>   IPL_NET, 0, "mtagpl", NULL);
> @@ -171,47 +187,32 @@ mbinit(void)
>   snprintf(mclnames[i], sizeof(mclnames[0]), "mcl%dk",
>   mclsizes[i] >> 10);
>   }
> - pool_init(&mclpools[i], mclsizes[i], 64, IPL_NET, 0,
> - mclnames[i], NULL);
> - pool_set_constraints(&mclpools[i], &kp_dma_contig);
> - pool_setlowat(&mclpools[i], mcllowat);
> +
> + m_pool_init(&mclpools[i], mclsizes[i], 64, mclnames[i]);
>   }
>  
>   (void)mextfree_register(m_extfree_pool);
>   KASSERT(num_extfree_fns == 1);
> -
> - nmbclust_update();
>  }
>  
>  voi

Re: reloading pf through ansible easy hook

2016-11-22 Thread BARDOU Pierre
Hello,

- name: "Loading pf.conf"
  template: src=pf.conf dest=/etc/ validate="pfctl -f %s"

Works fine for me.
Configuration is copied and loaded if correct, otherwise the rule file is not 
modified and not loaded (and the playbook fails with error).

--
Cordialement,
Pierre BARDOU

-Message d'origine-
De : owner-t...@openbsd.org [mailto:owner-t...@openbsd.org] De la part de 
Antoine Jacoutot
Envoyé : lundi 21 novembre 2016 23:48
À : sven falempin 
Cc : tech@openbsd.org
Objet : Re: reloading pf through ansible easy hook

On Mon, Nov 21, 2016 at 05:34:35PM -0500, sven falempin wrote:
> Ansible is already managing pkg and service of openBSD , cool
> 
> If one want to manage pf with it, and push or modify a few files, on 
> must run - command: /sbin/pfctl -f {{ dank.config }}
> 
> Yet - service could be use, if this glue was in the rc.d directory :

You can easily create an ansible role|module to do that natively.
The rc.d framework is only meant to handle real daemons.
We don't want it to manage pf, quota, network, mounts...

--
Antoine



Re: only free pool pages from the gc task

2016-11-22 Thread Mark Kettenis
> Date: Tue, 22 Nov 2016 12:45:44 +1000
> From: David Gwynne 
> 
> at the moment pages can be freed on a pool_put call and from the gc.
> 
> it is a bit unfair that pool_get may end up doing the heavy lifting
> of allocating a pool page and pool_put wont have to do an equivalent
> free, but we should try and minimise the amount of work done in
> these hot paths.
> 
> ok?

A potentially serious downside of this approach is that this makes us
rely on a thread to run in order to free up memory.  That may be ok
now that most of the network stack runs out of a thread.  But I'm not
confident (yet) that this is safe.

Best not to pile this immediately onto the other pool changes.

> Index: subr_pool.c
> ===
> RCS file: /cvs/src/sys/kern/subr_pool.c,v
> retrieving revision 1.204
> diff -u -p -r1.204 subr_pool.c
> --- subr_pool.c   21 Nov 2016 01:44:06 -  1.204
> +++ subr_pool.c   22 Nov 2016 02:43:23 -
> @@ -707,7 +707,7 @@ void
>  pool_put(struct pool *pp, void *v)
>  {
>   struct pool_item *pi = v;
> - struct pool_page_header *ph, *freeph = NULL;
> + struct pool_page_header *ph;
>  
>  #ifdef DIAGNOSTIC
>   if (v == NULL)
> @@ -770,17 +770,7 @@ pool_put(struct pool *pp, void *v)
>   pp->pr_nout--;
>   pp->pr_nput++;
>  
> - /* is it time to free a page? */
> - if (pp->pr_nidle > pp->pr_maxpages &&
> - (ph = TAILQ_FIRST(&pp->pr_emptypages)) != NULL &&
> - (ticks - ph->ph_tick) > (hz * pool_wait_free)) {
> - freeph = ph;
> - pool_p_remove(pp, freeph);
> - }
>   mtx_leave(&pp->pr_mtx);
> -
> - if (freeph != NULL)
> - pool_p_free(pp, freeph);
>  
>   if (!TAILQ_EMPTY(&pp->pr_requests)) {
>   mtx_enter(&pp->pr_requests_mtx);
> 
> 



Re: bpf_mtap(9) w/o KERNEL_LOCK

2016-11-22 Thread Martin Pieuchot
On 13/09/16(Tue) 12:23, Martin Pieuchot wrote:
> Here's the big scary diff I've been using for some months now to stop
> grabbing the KERNEL_LOCK() in bpf_mtap(9).  This has been originally
> written to prevent lock ordering inside pf_test().  Now that we're
> heading toward using a rwlock, we won't have this problem, but fewer
> usages of KERNEL_LOCK() is still interesting.
> 
> I'm going to split this diff in small chunks to ease the review.  But
> I'd appreciate if people could try to break it, test & report back.
> 
> Some notes:
> 
>   - Now that selwakeup() is called in a thread context (task) we only
> rely on the KERNEL_LOCK() to serialize access to kqueue(9) data.
> 
>   - The reference counting is here to make sure a descriptor is not
> freed during a sleep.  That's why the KERNEL_LOCK() is still
> necessary in the slow path.  On the other hand bpf_catchpacket()
> relies on the reference guaranteed by the SRP list.
> 
>   - A mutex now protects the rotating buffers and their associated
> fields.  It is dropped before calling ifpromisc() because USB
> devices sleep.
> 
>   - The dance around uiomove(9) is here to check that buffers aren't
> rotated while data is copied to userland.  Setting ``b->bd_fbuf''
> to NULL should be enough to let bpf_catchpacket() to drop the
> patcket.  But I added ``__in_uiomove'' to be able to have usable
> panic if something weird happen.

Next extract diff that tweaks bpf_detachd().

The goal here is to remove ``d'' from the list and NULLify ``d->bd_bif''
before calling ifpromisc().

The reason is that ifpromisc() can sleep.  Think USB ;)  So we'll have to
release the mutex before calling it.  So we want to make sure ``d'' is no
longer in the interface descriptor list at this point.

ok?

Index: net/bpf.c
===
RCS file: /cvs/src/sys/net/bpf.c,v
retrieving revision 1.154
diff -u -p -r1.154 bpf.c
--- net/bpf.c   21 Nov 2016 09:15:40 -  1.154
+++ net/bpf.c   22 Nov 2016 10:47:05 -
@@ -288,6 +288,23 @@ bpf_detachd(struct bpf_d *d)
struct bpf_if *bp;
 
bp = d->bd_bif;
+   /* Not attached. */
+   if (bp == NULL)
+   return;
+
+   /* Remove ``d'' from the interface's descriptor list. */
+   KERNEL_ASSERT_LOCKED();
+   SRPL_REMOVE_LOCKED(&bpf_d_rc, &bp->bif_dlist, d, bpf_d, bd_next);
+
+   if (SRPL_EMPTY_LOCKED(&bp->bif_dlist)) {
+   /*
+* Let the driver know that there are no more listeners.
+*/
+   *bp->bif_driverp = NULL;
+   }
+
+   d->bd_bif = NULL;
+
/*
 * Check if this descriptor had requested promiscuous mode.
 * If so, turn it off.
@@ -305,19 +322,6 @@ bpf_detachd(struct bpf_d *d)
 */
panic("bpf: ifpromisc failed");
}
-
-   /* Remove d from the interface's descriptor list. */
-   KERNEL_ASSERT_LOCKED();
-   SRPL_REMOVE_LOCKED(&bpf_d_rc, &bp->bif_dlist, d, bpf_d, bd_next);
-
-   if (SRPL_EMPTY_LOCKED(&bp->bif_dlist)) {
-   /*
-* Let the driver know that there are no more listeners.
-*/
-   *d->bd_bif->bif_driverp = 0;
-   }
-
-   d->bd_bif = NULL;
 }
 
 void
@@ -372,8 +376,7 @@ bpfclose(dev_t dev, int flag, int mode, 
 
d = bpfilter_lookup(minor(dev));
s = splnet();
-   if (d->bd_bif)
-   bpf_detachd(d);
+   bpf_detachd(d);
bpf_wakeup(d);
LIST_REMOVE(d, bd_list);
bpf_put(d);
@@ -1033,12 +1036,10 @@ bpf_setif(struct bpf_d *d, struct ifreq 
goto out;
}
if (candidate != d->bd_bif) {
-   if (d->bd_bif)
-   /*
-* Detach if attached to something else.
-*/
-   bpf_detachd(d);
-
+   /*
+* Detach if attached to something else.
+*/
+   bpf_detachd(d);
bpf_attachd(d, candidate);
}
bpf_resetd(d);



Remove many splsoftnet() in ioctl(2) path

2016-11-22 Thread Martin Pieuchot
As soon as we ensure that ifioctl() is always called at IPL_SOFTNET
we can remove all of these.

Recursions still exist, but this is a good step forward.

Index: net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.462
diff -u -p -r1.462 if.c
--- net/if.c21 Nov 2016 09:09:06 -  1.462
+++ net/if.c21 Nov 2016 10:31:30 -
@@ -133,6 +133,8 @@
 void   if_attachsetup(struct ifnet *);
 void   if_attachdomain(struct ifnet *);
 void   if_attach_common(struct ifnet *);
+intif_setrdomain(struct ifnet *, int);
+void   if_slowtimo(void *);
 
 void   if_detached_start(struct ifnet *);
 intif_detached_ioctl(struct ifnet *, u_long, caddr_t);
@@ -402,6 +404,8 @@ if_attachsetup(struct ifnet *ifp)
 {
unsigned long ifidx;
 
+   splsoftassert(IPL_SOFTNET);
+
TAILQ_INIT(&ifp->if_groups);
 
if_addgroup(ifp, IFG_ALL);
@@ -501,17 +505,25 @@ if_attachdomain(struct ifnet *ifp)
 void
 if_attachhead(struct ifnet *ifp)
 {
+   int s;
+
+   s = splsoftnet();
if_attach_common(ifp);
TAILQ_INSERT_HEAD(&ifnet, ifp, if_list);
if_attachsetup(ifp);
+   splx(s);
 }
 
 void
 if_attach(struct ifnet *ifp)
 {
+   int s;
+
+   s = splsoftnet();
if_attach_common(ifp);
TAILQ_INSERT_TAIL(&ifnet, ifp, if_list);
if_attachsetup(ifp);
+   splx(s);
 }
 
 void
@@ -1036,7 +1048,9 @@ if_clone_create(const char *name, int rd
 {
struct if_clone *ifc;
struct ifnet *ifp;
-   int unit, ret, s;
+   int unit, ret;
+
+   splsoftassert(IPL_SOFTNET);
 
ifc = if_clone_lookup(name, &unit);
if (ifc == NULL)
@@ -1045,9 +1059,7 @@ if_clone_create(const char *name, int rd
if (ifunit(name) != NULL)
return (EEXIST);
 
-   s = splsoftnet();
ret = (*ifc->ifc_create)(ifc, unit);
-   splx(s);
 
if (ret != 0 || (ifp = ifunit(name)) == NULL)
return (ret);
@@ -1067,7 +1079,8 @@ if_clone_destroy(const char *name)
 {
struct if_clone *ifc;
struct ifnet *ifp;
-   int error, s;
+
+   splsoftassert(IPL_SOFTNET);
 
ifc = if_clone_lookup(name, NULL);
if (ifc == NULL)
@@ -1081,15 +1094,13 @@ if_clone_destroy(const char *name)
return (EOPNOTSUPP);
 
if (ifp->if_flags & IFF_UP) {
+   int s;
s = splnet();
if_down(ifp);
splx(s);
}
 
-   s = splsoftnet();
-   error = (*ifc->ifc_destroy)(ifp);
-   splx(s);
-   return (error);
+   return ((*ifc->ifc_destroy)(ifp));
 }
 
 /*
@@ -1594,7 +1605,7 @@ int
 if_setrdomain(struct ifnet *ifp, int rdomain)
 {
struct ifreq ifr;
-   int s, error;
+   int error;
 
if (rdomain < 0 || rdomain > RT_TABLEID_MAX)
return (EINVAL);
@@ -1618,11 +1629,8 @@ if_setrdomain(struct ifnet *ifp, int rdo
if (error && (ifp != loifp || error != EEXIST))
return (error);
 
-
-   s = splsoftnet();
if ((error = rtable_add(rdomain)) == 0)
rtable_l2set(rdomain, rdomain, loifp->if_index);
-   splx(s);
if (error) {
if_clone_destroy(loifname);
return (error);
@@ -1638,6 +1646,8 @@ if_setrdomain(struct ifnet *ifp, int rdo
/* remove all routing entries when switching domains */
/* XXX this is a bit ugly */
if (rdomain != ifp->if_rdomain) {
+   int s;
+
s = splnet();
/*
 * We are tearing down the world.
@@ -1725,20 +1735,15 @@ ifioctl(struct socket *so, u_long cmd, c
switch (ifar->ifar_af) {
case AF_INET:
/* attach is a noop for AF_INET */
-   if (cmd == SIOCIFAFDETACH) {
-   s = splsoftnet();
+   if (cmd == SIOCIFAFDETACH)
in_ifdetach(ifp);
-   splx(s);
-   }
return (0);
 #ifdef INET6
case AF_INET6:
-   s = splsoftnet();
if (cmd == SIOCIFAFATTACH)
error = in6_ifattach(ifp);
else
in6_ifdetach(ifp);
-   splx(s);
return (error);
 #endif /* INET6 */
default:
@@ -1804,9 +1809,7 @@ ifioctl(struct socket *so, u_long cmd, c
 
 #ifdef INET6
if (ISSET(ifr->ifr_flags, IFXF_AUTOCONF6)) {
-   s = splsoftnet();
error = in6_ifattach(ifp);
-   splx(s);
if (error != 0)
return (error);
}
@@ 

Re: ifioctl, pr_ctlinput, pr_slowtimo & pr_fasttimo

2016-11-22 Thread Martin Pieuchot
On 21/11/16(Mon) 20:07, Alexander Bluhm wrote:
> [...] 
> NFS hits you again.  nfs_boot_init() calls ifioctl().  Perhaps put
> the splsoftnet() inside ifioctl().

I'll put it inside nfs_boot() because the function also iterate on
structures that need to be protected.  Note that this should not matter
during boot, but we might have to play tricks if we put too many
asserts.

Index: kern/sys_socket.c
===
RCS file: /cvs/src/sys/kern/sys_socket.c,v
retrieving revision 1.24
diff -u -p -r1.24 sys_socket.c
--- kern/sys_socket.c   21 Nov 2016 10:30:42 -  1.24
+++ kern/sys_socket.c   22 Nov 2016 10:30:18 -
@@ -119,8 +119,12 @@ soo_ioctl(struct file *fp, u_long cmd, c
 * interface and routing ioctls should have a
 * different entry since a socket's unnecessary
 */
-   if (IOCGROUP(cmd) == 'i')
-   return (ifioctl(so, cmd, data, p));
+   if (IOCGROUP(cmd) == 'i') {
+   s = splsoftnet();
+   error = ifioctl(so, cmd, data, p);
+   splx(s);
+   return (error);
+   }
if (IOCGROUP(cmd) == 'r')
return (EOPNOTSUPP);
s = splsoftnet();
Index: nfs/nfs_boot.c
===
RCS file: /cvs/src/sys/nfs/nfs_boot.c,v
retrieving revision 1.39
diff -u -p -r1.39 nfs_boot.c
--- nfs/nfs_boot.c  1 Sep 2015 21:24:04 -   1.39
+++ nfs/nfs_boot.c  22 Nov 2016 10:36:36 -
@@ -122,7 +122,7 @@ nfs_boot_init(struct nfs_diskless *nd, s
struct socket *so;
struct ifaddr *ifa;
char addr[INET_ADDRSTRLEN];
-   int error;
+   int s, error;
 
/*
 * Find an interface, rarp for its ip address, stuff it, the
@@ -159,11 +159,15 @@ nfs_boot_init(struct nfs_diskless *nd, s
 */
if ((error = socreate(AF_INET, &so, SOCK_DGRAM, 0)) != 0)
panic("nfs_boot: socreate, error=%d", error);
+   s = splsoftnet();
error = ifioctl(so, SIOCGIFFLAGS, (caddr_t)&ireq, procp);
+   splx(s);
if (error)
panic("nfs_boot: GIFFLAGS, error=%d", error);
ireq.ifr_flags |= IFF_UP;
+   s = splsoftnet();
error = ifioctl(so, SIOCSIFFLAGS, (caddr_t)&ireq, procp);
+   splx(s);
if (error)
panic("nfs_boot: SIFFLAGS, error=%d", error);
 
@@ -186,7 +190,9 @@ nfs_boot_init(struct nfs_diskless *nd, s
sin->sin_len = sizeof(*sin);
sin->sin_family = AF_INET;
sin->sin_addr.s_addr = my_ip.s_addr;
+   s = splsoftnet();
error = ifioctl(so, SIOCAIFADDR, (caddr_t)&ifra, procp);
+   splx(s);
if (error)
panic("nfs_boot: set if addr, error=%d", error);
 



Re: small indentation and spelling diff

2016-11-22 Thread Alexander Bluhm
On Mon, Nov 21, 2016 at 05:36:21PM -0700, Kyle Milz wrote:
> -  * everywhere. It wouldn't surprise me if several stacks accidently
> +  * everywhere. It wouldn't surprise me if several stacks accidentally

/usr/share/dict/words has both variants.

bluhm



Re: only free pool pages from the gc task

2016-11-22 Thread Martin Pieuchot
On 22/11/16(Tue) 12:45, David Gwynne wrote:
> [...] 
> it is a bit unfair that pool_get may end up doing the heavy lifting
> of allocating a pool page and pool_put wont have to do an equivalent
> free, but we should try and minimise the amount of work done in
> these hot paths.

Any number, analysis or reference to back your claim?