Re: [PATCH] add Gemini Lake SoC pcidevs

2018-12-13 Thread Heppler, J. Scott

I have an HP Stream 14 with an n4000 Gemini Lake mobile processor.
The amd64_current does not find the eMMC storage.
Would it be of value to the project to apply the patch, generate an
install image using release(8), test and submit the dmesg?

--
J. Scott Heppler



pci/i386: Fix typo

2018-12-13 Thread Christian Ludwig
Extents code has its own set of flags and does not use malloc's.
The current code passes in EX_FAST by accident, which does no harm.
That's probably why nobody stumbled upon it, yet.
---
 sys/arch/i386/pci/pci_machdep.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sys/arch/i386/pci/pci_machdep.c b/sys/arch/i386/pci/pci_machdep.c
index b712ff242ed..bfd8055f125 100644
--- a/sys/arch/i386/pci/pci_machdep.c
+++ b/sys/arch/i386/pci/pci_machdep.c
@@ -915,7 +915,7 @@ pci_init_extents(void)
NULL, 0, EX_NOWAIT | EX_FILLED);
if (pciio_ex == NULL)
return;
-   extent_free(pciio_ex, 0, 0x1, M_NOWAIT);
+   extent_free(pciio_ex, 0, 0x1, EX_NOWAIT);
}
 
if (pcimem_ex == NULL) {
-- 
2.19.2
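
Why the old call was harmless can be shown with a small sketch. The flag values below are written from memory of OpenBSD's <sys/malloc.h> and <sys/extent.h>, and the DEMO_ names and helper functions are hypothetical, so treat all of them as assumptions and check the headers in your tree. Under those assumptions M_NOWAIT and EX_FAST share the same bit, so extent code sees "fast" and still does not sleep:

```c
#include <assert.h>

/*
 * Assumed values, recalled from <sys/malloc.h> and <sys/extent.h>.
 * The DEMO_ prefix marks them as illustrative copies, not the real macros.
 */
#define DEMO_M_WAITOK	0x0001
#define DEMO_M_NOWAIT	0x0002
#define DEMO_EX_NOWAIT	0x0000
#define DEMO_EX_WAITOK	0x0001
#define DEMO_EX_FAST	0x0002

/* Under the assumed values, malloc's M_NOWAIT aliases the extent EX_FAST bit. */
static_assert(DEMO_M_NOWAIT == DEMO_EX_FAST, "M_NOWAIT aliases EX_FAST");

/* How extent code would interpret a flags word: may the call sleep? */
int
demo_ex_may_sleep(int flags)
{
	return (flags & DEMO_EX_WAITOK) != 0;
}

/* How extent code would interpret a flags word: is EX_FAST requested? */
int
demo_ex_is_fast(int flags)
{
	return (flags & DEMO_EX_FAST) != 0;
}
```

Passing DEMO_M_NOWAIT to these helpers reports "fast" and "may not sleep", which matches the commit message: wrong flag namespace, same non-sleeping behaviour.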



Re: pci/i386: Fix typo

2018-12-13 Thread Mark Kettenis
> From: Christian Ludwig 
> Cc: Christian Ludwig 
> Date: Thu, 13 Dec 2018 17:44:47 +0100
> 
> Extents code has its own set of flags and does not use malloc's.
> The current code passes in EX_FAST by accident, which does no harm.
> That's probably why nobody stumbled upon it, yet.
> ---
>  sys/arch/i386/pci/pci_machdep.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

ok kettenis@

> diff --git a/sys/arch/i386/pci/pci_machdep.c b/sys/arch/i386/pci/pci_machdep.c
> index b712ff242ed..bfd8055f125 100644
> --- a/sys/arch/i386/pci/pci_machdep.c
> +++ b/sys/arch/i386/pci/pci_machdep.c
> @@ -915,7 +915,7 @@ pci_init_extents(void)
>   NULL, 0, EX_NOWAIT | EX_FILLED);
>   if (pciio_ex == NULL)
>   return;
> - extent_free(pciio_ex, 0, 0x1, M_NOWAIT);
> + extent_free(pciio_ex, 0, 0x1, EX_NOWAIT);
>   }
>  
>   if (pcimem_ex == NULL) {
> -- 
> 2.19.2
> 
> 



Re: request for testing: patch for boot loader out of mem

2018-12-13 Thread Ted Unangst
Otto Moerbeek wrote:
>  int
>  biosd_diskio(int rw, struct diskinfo *dip, u_int off, int nsect, void *buf)
>  {
> - return biosd_io(rw, &dip->bios_info, off, nsect, buf);
> + int i, n, ret;
> +
> + /*
> +  * Avoid doing too large reads, the bounce buffer used by biosd_io()
> +  * might run us out-of-mem.
> +  */
> + for (i = 0, ret = 0; ret == 0 && nsect > 0;
> + i += MAXSECTS, nsect -= MAXSECTS) {
> + n = nsect >= MAXSECTS ? MAXSECTS : nsect;
> + ret = biosd_io(rw, &dip->bios_info, off + i, n,
> + buf + i * DEV_BSIZE);

you're doing pointer arithmetic on a void * here. needs a cast.

or perhaps something like this, eliminating i.

char *dest = buf;

for (ret = 0; ret == 0; off += MAXSECTS, dest += MAXSECTS * DEV_BSIZE)
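
A self-contained sketch of that idea follows. It is not the committed code: demo_io() is a hypothetical stand-in for biosd_io() that copies from an in-memory "disk", and DEV_BSIZE is assumed to be 512. The point is only that stepping a char * avoids arithmetic on void * and that no single request exceeds MAXSECTS sectors:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEV_BSIZE	512	/* sector size, assumed for this sketch */
#define MAXSECTS	32	/* per-request cap, as in the patch */
#define DEMO_NSECT	128	/* size of the fake disk, in sectors */

static unsigned char demo_disk[DEMO_NSECT * DEV_BSIZE];
static int demo_calls;		/* number of low-level requests issued */
static int demo_max_nsect;	/* largest single request seen */

/* Hypothetical stand-in for biosd_io(): read nsect sectors at off. */
int
demo_io(unsigned int off, int nsect, void *buf)
{
	assert(nsect > 0);
	if ((size_t)off + (size_t)nsect > DEMO_NSECT)
		return 1;	/* out of range */
	memcpy(buf, demo_disk + (size_t)off * DEV_BSIZE,
	    (size_t)nsect * DEV_BSIZE);
	demo_calls++;
	if (nsect > demo_max_nsect)
		demo_max_nsect = nsect;
	return 0;
}

/* Split a large read into chunks of at most MAXSECTS sectors. */
int
demo_diskio(unsigned int off, int nsect, void *buf)
{
	char *dest = buf;	/* char *, so pointer arithmetic is defined */
	int n, ret = 0;

	while (ret == 0 && nsect > 0) {
		n = nsect >= MAXSECTS ? MAXSECTS : nsect;
		ret = demo_io(off, n, dest);
		off += n;
		dest += (size_t)n * DEV_BSIZE;
		nsect -= n;
	}
	return ret;
}
```

Advancing off and dest by n rather than by MAXSECTS keeps the bookkeeping exact on the final, shorter chunk.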



Re: relayd and TLS client cert verification

2018-12-13 Thread Ashe Connor


> On 6 Dec 2018, at 16:17, Ashe Connor  wrote:
> 
> It's been a week or so, so bumping.  (Benno was kind enough to offer a
> review but was time-poor recently.)

Another friendly ping.  I'd love to do some more work on relayd but only if 
it's desirable/worth someone's time to review.

Cheers,

Ashe



Re: let etherip(4) output directly to the ip stack

2018-12-13 Thread David Gwynne



> On 13 Dec 2018, at 00:03, Martin Pieuchot  wrote:
> 
> On 12/12/18(Wed) 12:03, David Gwynne wrote:
>> with the previous if_ethersubr.c diff, this allows etherip(4) to output
>> directly to the network stack.
> 
> What do you mean with "directly"?
> 
> To my understanding ip{,6}_etherip_output() call ip{,6}_send() to enqueue
> packets.

You guess below.

> 
>> direct output relies on the interface using priq, since hfsc uses the
>> ifq machinery to work. priq implies you dont want to delay packets, so
>> it lets etherip push the packet straight through.
> 
> So "direct output" means skip the queue(s)?

The ifq specifically, yes.

> So are we heading towards a design where ifq_enqueue(9) is going to be
> bypassed for pseudo-drivers unless people use HFSC?  If that's the case
> I'm afraid that code path will rot.

I don't have a good answer to that, except to say that bridge may be popular
enough to keep pushing packets through if_enqueue().

> 
>> i dont think the ifq stuff is slow, but it does allow the stack to
>> concurrently send packets with etherip without having to create an ifq
>> per cpu. it does rely on per cpu counters, but theyre relatively small.
> 
> I don't understand how it allows the stack to concurrently send packets.
> All packets seems to be serialized in a single task when ip{,6}_send()
> is called.

In the etherip(4) case right now, yes. Presumably ip{,6}_send() will be fixed 
one day though, and etherip and co will be able to take advantage of it.

Similar treatment for vlan(4) has more immediate benefit.

> 
>> the other advantage is it stops ether_output recursing in the etherip
>> path. this helps profiling, but that is a fairly niche benefit so maybe
>> just focus on the concurrency.
> 
> I don't understand.  How is ether_output() recursing?

Hrm. Yes. ether_output doesn't recurse. Thinking about it, maybe I meant 
ifq_start.

> 
>> ok?
> 
> I'm a bit sad about where this is going.  Because the design you adopted
> will force us to modify all pseudo-drivers.  Apart from the fact that you're
> now breaking etherip(4) into bridge(4) /!\, I believe we can do better.

It doesn't force anything. Hopefully there's some performance benefit from 
changes like this to pseudo drivers, but there's no technical code change 
requirement because of it.

However, you are right about bridge, which this diff will break, so let's drop 
it for now.
> 
> We had this design:
> 
>   STACK   DRIVER
> 
> ether_output() -> if_enqueue() -> ifq_enqueue() + ifq_start() | drv_start()
>   `.  |
> `--> bridge_output()  |

Sigh. Why does ethernet get tentacles into the stack like this?

> The problem that you want to solve, if I'm not mistaken, is to remove
> the useless stack/driver separation for pseudo-drivers.  In other words
> stop queueing & delaying packets.  This only makes sense for "physical"
> nics where mbufs are put into DMA buckets, right?

There are two benefits to ifqs on physical nics which do not apply to pseudo
interfaces. Real interfaces have things like rings, and access to those rings 
requires serialisation. ifq provides that serialisation so many cpus can try 
and queue packets on the ring, but ifq makes sure that only one of the cpus 
does the work. vlan(4) and etherip(4) don't have a single thing that they need 
to serialise access to, they just need to put a header on and push it down.

The second benefit is that it gives us a nice place to do tx mitigation. The 
cost of putting a packet on the wire is shared between actually filling in
descriptors on a ring, and then actually notifying the chip that the ring has 
work to do. Notifying the chip is surprisingly expensive, so it is better to 
amortise that cost by putting multiple packets on the ring before notifying the 
chip. We aren't doing this currently, but ifqs are the right place to do this. 
There's no chip notification overhead on pseudo interfaces, so applying tx mit 
to them just adds latency, and likely unpredictable latency as packets bubble 
down interfaces.
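
The amortisation argument can be made concrete with a toy model. Everything here is hypothetical (struct demo_ring, demo_post(), demo_kick() and the batch size are illustration only, not ifq or driver API): posting a descriptor is cheap, a "doorbell" write to the chip is expensive, and batching reduces how many doorbells a burst of packets costs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy transmit ring: counts descriptors posted and chip notifications. */
struct demo_ring {
	unsigned int posted;	/* descriptors filled in so far */
	unsigned int pending;	/* posted but not yet announced to the chip */
	unsigned int doorbells;	/* expensive notifications issued */
};

/* Fill in one descriptor (the cheap part). */
void
demo_post(struct demo_ring *r)
{
	assert(r != NULL);
	r->posted++;
	r->pending++;
}

/* Tell the chip there is work (the expensive part), if any is pending. */
void
demo_kick(struct demo_ring *r)
{
	assert(r != NULL);
	if (r->pending > 0) {
		r->doorbells++;
		r->pending = 0;
	}
}

/* Notify after every packet: one doorbell per packet. */
unsigned int
demo_send_each(struct demo_ring *r, unsigned int npkts)
{
	unsigned int i;

	for (i = 0; i < npkts; i++) {
		demo_post(r);
		demo_kick(r);
	}
	return r->doorbells;
}

/* Notify once per batch, plus once for any remainder. */
unsigned int
demo_send_batched(struct demo_ring *r, unsigned int npkts, unsigned int batch)
{
	unsigned int i;

	for (i = 0; i < npkts; i++) {
		demo_post(r);
		if (r->pending >= batch)
			demo_kick(r);
	}
	demo_kick(r);	/* flush whatever is left */
	return r->doorbells;
}
```

A pseudo interface has no doorbell to amortise, so in this model batching would only add the waiting time before the flush, which is the latency concern raised above.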

> In that case I'd suggest modifying if_enqueue() or ifq_enqueue() to treat
> IFXF_CLONED devices as special if they are not using HFSC.

I'll think about it. My first thought is that it is unfair for every interface 
to have this check added to the output path, but then I remember that we check 
to see if every interface is part of a bridge on output too, even non ethernet 
ones like gif and pppoe, and even though that makes me sad it's working ok in 
practice.

> In other words I suggest you only need to modify what happens *after*
> if_enqueue(), the driver part or actual drv_start().  What about having
> a generic:
> 
>   if (ifp->if_enqueue != NULL && ifq_is_priq(&ifp->if_snd)) {
>   counters_pkt(ifp->if_counters, ifc_opackets, ifc_obytes,
>   m->m_pkthdr.len); 
>

Re: request for testing: patch for boot loader out of mem

2018-12-13 Thread Nick Holland
On 12/11/18 08:09, Otto Moerbeek wrote:
> On Mon, Dec 10, 2018 at 11:44:47AM +0100, Otto Moerbeek wrote:
> 
>> On Mon, Dec 10, 2018 at 08:30:10AM +0100, Otto Moerbeek wrote:
>> 
>> > Hi,
>> > 
>> > the bootloader uses a very simple allocator for dynamic memory. It
>> > maintains a list of free allocations. If it needs a block, it searches
>> > the freelist and returns the smallest allocation that fits.
>> > 
>> > Allocation patterns like this (starting with an empty freelist)
>> > 
>> > alloc(big)
>> > free(big)
>> > alloc(small)
>> > 
>> > will assign a big block for the small allocation, wasting most
>> > memory. The allocator does not split up this block. After this, a new
>> > big allocation will grow the heap with the big amount. This diff
>> > changes the strategy by not re-using a block from the free list if
>> > half the space or more would be wasted. Instead, it grows the heap by
>> > the requested amount.
>> > 
>> > This makes it possible for me to boot using a root fs with a large
>> > blocksize. There have been several reports of large roots not working
>> > (the bootloader allocates memory based on the blocksize of the file
>> > system, and by default larger filesystems use larger blocks).
>> > 
>> > How to test
>> > ===
>> > 
>> > Apply diff and do a full build including building release. After that,
>> > either upgrade using your newly built cd64.iso, bsd.rd or other
>> > mechanism or do a full install. Test that you can boot afterwards.
>> > 
>> > This needs to be tested on various platforms, both with small and big
>> > (> 600G) root filesystems.  Yes, this is tedious, but we want large
>> > coverage of different cases.
>> > 
>> >-Otto
>> 
>> As it turns out by my own testing, on amd64 root filesystems using 32k
>> blocks now work fine, but 64k fs blocks still hit a ceiling. This
>> corresponds to > 512G disks if you use the defaults.
>> 
>>  -Otto
>> 
> 
> New diff that also works on root filesystems > 500G. It avoids using a
> large bounce buffer by reading large buffers in a loop instead of one go.
> 
>   -Otto
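
The allocation strategy quoted above can be sketched as follows. This is a minimal model, not the bootloader's allocator: blocks are tracked only by size, and every name (demo_alloc(), demo_free(), demo_heap_used) is hypothetical. It picks the smallest free block that fits, but refuses to reuse it when half or more of it would be wasted, and grows the heap by the requested amount instead.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAXBLOCKS	16

struct demo_block {
	size_t size;
	int is_free;
};

static struct demo_block demo_blocks[DEMO_MAXBLOCKS];
static int demo_nblocks;
static size_t demo_heap_used;	/* total bytes the heap has grown to */

/* Returns a block index, or -1 if the block table is full. */
int
demo_alloc(size_t want)
{
	int i, best = -1;

	/* Best fit: the smallest free block that is large enough. */
	for (i = 0; i < demo_nblocks; i++) {
		if (!demo_blocks[i].is_free || demo_blocks[i].size < want)
			continue;
		if (best == -1 || demo_blocks[i].size < demo_blocks[best].size)
			best = i;
	}

	/* Reuse only if less than half of the block would be wasted. */
	if (best != -1 &&
	    (demo_blocks[best].size - want) * 2 < demo_blocks[best].size) {
		demo_blocks[best].is_free = 0;
		return best;
	}

	/* Otherwise grow the heap by exactly the requested amount. */
	if (demo_nblocks == DEMO_MAXBLOCKS)
		return -1;
	demo_blocks[demo_nblocks].size = want;
	demo_blocks[demo_nblocks].is_free = 0;
	demo_heap_used += want;
	return demo_nblocks++;
}

void
demo_free(int idx)
{
	assert(idx >= 0 && idx < demo_nblocks);
	demo_blocks[idx].is_free = 1;
}
```

With the alloc(big), free(big), alloc(small) pattern, the small request now gets its own block and the big block stays available for the next big request, so the heap stops growing by the big amount each time.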

You are my hero.
Seems it is possible to hose a system by making a 32k block size on a
system with a root file system of only 500MB.  I really don't know how I
did this, much less why, but it's been causing me reboot problems for
over a year now.

Upgraded to today's snap, problem solved.

Nick.

/home/nick $ dmesg|head  
OpenBSD 6.4-current (GENERIC.MP) #510: Thu Dec 13 06:20:42 MST 2018
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP


> p m
OpenBSD area: 64-2000397735; size: 976756.7M; free: 859532.4M
#         size      offset  fstype [fsize bsize   cpg]
  a:    502.0M          64  4.2BSD   4096 32768     1 # / # wtf?
  b:  20473.5M  1048578560    swap                    #
  c: 976762.3M           0  unused
  d:  10244.6M  1090508288  4.2BSD   2048 16384     1 # /usr
  e:   4094.7M      489152  4.2BSD   2048 16384     1 # /tmp
  f:  10236.7M  1119875072  4.2BSD   2048 16384     1 # /var
  g:  20473.5M  1161804704  4.2BSD   2048 16384     1 # /repo
  h:  10236.7M  1140839904  4.2BSD   2048 16384     1 # /home
  i:      7.8M  1203734368  4.2BSD   2048 16384     1 #
  j:  40954.8M  1203750432  4.2BSD   2048 16384     1


> Index: arch/amd64/stand/libsa/biosdev.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/stand/libsa/biosdev.c,v
> retrieving revision 1.32
> diff -u -p -r1.32 biosdev.c
> --- arch/amd64/stand/libsa/biosdev.c  10 Aug 2018 16:41:35 -  1.32
> +++ arch/amd64/stand/libsa/biosdev.c  11 Dec 2018 13:00:02 -
> @@ -340,11 +340,26 @@ biosd_io(int rw, bios_diskinfo_t *bd, u_
>   return error;
>  }
>  
> +#define MAXSECTS 32
> +
>  int
>  biosd_diskio(int rw, struct diskinfo *dip, u_int off, int nsect, void *buf)
>  {
> - return biosd_io(rw, &dip->bios_info, off, nsect, buf);
> + int i, n, ret;
> +
> + /*
> +  * Avoid doing too large reads, the bounce buffer used by biosd_io()
> +  * might run us out-of-mem.
> +  */
> + for (i = 0, ret = 0; ret == 0 && nsect > 0;
> + i += MAXSECTS, nsect -= MAXSECTS) {
> + n = nsect >= MAXSECTS ? MAXSECTS : nsect;
> + ret = biosd_io(rw, &dip->bios_info, off + i, n,
> + buf + i * DEV_BSIZE);
> + }
> + return ret;
>  }
> +
>  /*
>   * Try to read the bsd label on the given BIOS device.
>   */
> @@ -715,7 +730,6 @@ biosstrategy(void *devdata, int rw, dadd
>  size_t *rsize)
>  {
>   struct diskinfo *dip = (struct diskinfo *)devdata;
> - bios_diskinfo_t *bd = &dip->bios_info;
>   u_int8_t error = 0;
>   size_t nsect;
>  
> @@ -732,7 +746,7 @@ biosstrategy(void *devdata, int rw, dadd
>   if (blk < 0)
>   error = EINVAL;
>   else
> - error = bios

Re: request for testing: patch for boot loader out of mem

2018-12-13 Thread Otto Moerbeek
On Thu, Dec 13, 2018 at 09:50:13PM -0500, Nick Holland wrote:

> On 12/11/18 08:09, Otto Moerbeek wrote:
> > On Mon, Dec 10, 2018 at 11:44:47AM +0100, Otto Moerbeek wrote:
> > 
> >> On Mon, Dec 10, 2018 at 08:30:10AM +0100, Otto Moerbeek wrote:
> >> 
> >> > Hi,
> >> > 
> >> > the bootloader uses a very simple allocator for dynamic memory. It
> >> > maintains a list of free allocations. If it needs a block, it searches
> >> > the freelist and returns the smallest allocation that fits.
> >> > 
> >> > Allocation patterns like this (starting with an empty freelist)
> >> > 
> >> > alloc(big)
> >> > free(big)
> >> > alloc(small)
> >> > 
> >> > will assign a big block for the small allocation, wasting most
> >> > memory. The allocator does not split up this block. After this, a new
> >> > big allocation will grow the heap with the big amount. This diff
> >> > changes the strategy by not re-using a block from the free list if
> >> > half the space or more would be wasted. Instead, it grows the heap by
> >> > the requested amount.
> >> > 
> >> > This makes it possible for me to boot using a root fs with a large
> >> > blocksize. There have been several reports of large roots not working
> >> > (the bootloader allocates memory based on the blocksize of the file
> >> > system, and by default larger filesystems use larger blocks).
> >> > 
> >> > How to test
> >> > ===
> >> > 
> >> > Apply diff and do a full build including building release. After that,
> >> > either upgrade using your newly built cd64.iso, bsd.rd or other
> >> > mechanism or do a full install. Test that you can boot afterwards.
> >> > 
> >> > This needs to be tested on various platforms, both with small and big
> >> > (> 600G) root filesystems.  Yes, this is tedious, but we want large
> >> > coverage of different cases.
> >> > 
> >> >  -Otto
> >> 
> >> As it turns out by my own testing, on amd64 root filesystems using 32k
> >> blocks now work fine, but 64k fs blocks still hit a ceiling. This
> >> corresponds to > 512G disks if you use the defaults.
> >> 
> >>-Otto
> >> 
> > 
> > New diff that also works on root filesystems > 500G. It avoids using a
> > large bounce buffer by reading large buffers in a loop instead of one go.
> > 
> > -Otto
> 
> You are my hero.
> Seems it is possible to hose a system by making a 32k block size on a
> system with a root file system of only 500MB.  I really don't know how I
> did this, much less why, but it's been causing me reboot problems for
> over a year now.

Default block size is 16k. Filesystems > 128G get a block size of 32k
and > 512G get 64k block size. Fragments remain at 8 per block.

I don't know what you did with your 512M fs to make it get a 32k
blocksize. Did it have a large partition earlier and then you reduced
its size? In that case the bs will remain the same, even if you
newfs, since newfs reads the bs from the label if not given on the
command line.
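
The defaults described above can be written down as a small function. This is a sketch of the rule as stated in this mail, not the code newfs or disklabel actually runs; demo_default_geom() is a hypothetical name, and reading "G" as GiB is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_KIB	1024ULL
#define DEMO_GIB	(1024ULL * 1024 * 1024)	/* "G" taken as GiB: assumption */

struct demo_geom {
	uint32_t bsize;	/* file system block size in bytes */
	uint32_t fsize;	/* fragment size in bytes */
};

/* Default block and fragment size for a file system of fs_bytes bytes. */
struct demo_geom
demo_default_geom(uint64_t fs_bytes)
{
	struct demo_geom g;

	if (fs_bytes > 512 * DEMO_GIB)
		g.bsize = 64 * DEMO_KIB;
	else if (fs_bytes > 128 * DEMO_GIB)
		g.bsize = 32 * DEMO_KIB;
	else
		g.bsize = 16 * DEMO_KIB;

	g.fsize = g.bsize / 8;	/* 8 fragments per block */
	assert(g.fsize * 8 == g.bsize);
	return g;
}
```

By this rule a 502M partition would default to 16384/2048, so the 32768/4096 on Nick's "a" partition has to come from somewhere else, such as an old label. His values do keep the 8 fragments per block ratio.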

The i386 and amd64 bootloaders are very tight on mem, they live in a
640k world. Some space is used for the stack and the code itself also
needs to fit in. Only a few 100k of heap remain. In various places
blocks of file system block size are allocated so you'll hit the limit
pretty quickly. The cpu firmware microcode loading function having a
mem leak (fixed already by jsing) also didn't help. 

Anyway, thanks for confirming my tests, even if by accident ;-)

-Otto

> 
> Upgraded to today's snap, problem solved.
> 
> Nick.
> 
> /home/nick $ dmesg|head  
> OpenBSD 6.4-current (GENERIC.MP) #510: Thu Dec 13 06:20:42 MST 2018
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
> 
> > p m
> OpenBSD area: 64-2000397735; size: 976756.7M; free: 859532.4M
> #         size      offset  fstype [fsize bsize   cpg]
>   a:    502.0M          64  4.2BSD   4096 32768     1 # / # wtf?
>   b:  20473.5M  1048578560    swap                    #
>   c: 976762.3M           0  unused
>   d:  10244.6M  1090508288  4.2BSD   2048 16384     1 # /usr
>   e:   4094.7M      489152  4.2BSD   2048 16384     1 # /tmp
>   f:  10236.7M  1119875072  4.2BSD   2048 16384     1 # /var
>   g:  20473.5M  1161804704  4.2BSD   2048 16384     1 # /repo
>   h:  10236.7M  1140839904  4.2BSD   2048 16384     1 # /home
>   i:      7.8M  1203734368  4.2BSD   2048 16384     1 #
>   j:  40954.8M  1203750432  4.2BSD   2048 16384     1
> 
> 
> > Index: arch/amd64/stand/libsa/biosdev.c
> > ===
> > RCS file: /cvs/src/sys/arch/amd64/stand/libsa/biosdev.c,v
> > retrieving revision 1.32
> > diff -u -p -r1.32 biosdev.c
> > --- arch/amd64/stand/libsa/biosdev.c  10 Aug 2018 16:41:35 -  1.32
> > +++ arch/amd64/stand/libsa/biosdev.c  11 Dec 2018 13:00:02 -
> > @@ -340,11 +340,26 @@ biosd_io(int rw, bios_diskinfo_t *bd, u_