Re: realloc(3) and MALLOC_MOVE

2017-01-27 Thread Otto Moerbeek
On Tue, Jan 24, 2017 at 10:07:23AM +0100, Otto Moerbeek wrote:

> Hi,
> 
> malloc(3) has the nice feature to move (subject to alignment
> constraints) allocations that are between the max chunk size (half a
> page) and a page size towards the end of the allocated page, to catch
> more buffer overflows. Due to the allocation being higher up within a
> page, buffer overflows will end up beyond the page in more cases.
> 
> Until now, realloc(3) did not handle this in a smart way. When it
> encountered an existing moved allocation it always did a
> malloc-copy-free dance.
> 
> This diff fixes that and, at the same time, handles more cases without
> requesting new pages from the kernel.
> 
> Please review and test,
> 
> BTW, we cannot use this feature for allocations larger than or equal
> to a page, since these should return page-aligned pointers.
> 
>   -Otto
> 

Anybody testing this? I'd like to make progress on this...

-Otto
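
The placement rule described in the quoted message (an allocation between half a page and a page is shifted toward the end of its page, subject to alignment) can be sketched in user space as below. This is a simplified illustration, not the code from malloc.c: the names `SKETCH_PAGESIZE`, `SKETCH_MINSIZE`, `sketch_move_offset` and `sketch_tail_slack` are made up, the page size and alignment are assumed values, and guard pages are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; the real ones are platform-dependent. */
#define SKETCH_PAGESIZE 4096u  /* assumed page size in bytes */
#define SKETCH_MINSIZE  16u    /* assumed minimum alignment of returned pointers */

/*
 * Offset from the start of the page at which an allocation of sz bytes
 * would start when it is shifted toward the end of the page.
 * The offset is rounded down to a multiple of SKETCH_MINSIZE so that the
 * returned pointer stays aligned.
 */
static size_t
sketch_move_offset(size_t sz)
{
	assert(sz > 0 && sz <= SKETCH_PAGESIZE);
	return (SKETCH_PAGESIZE - sz) & ~((size_t)SKETCH_MINSIZE - 1);
}

/*
 * Unused bytes between the end of the shifted allocation and the end of
 * the page. An overflow has to get past these bytes before it leaves the page.
 */
static size_t
sketch_tail_slack(size_t sz)
{
	return SKETCH_PAGESIZE - (sketch_move_offset(sz) + sz);
}
```

With these assumed constants, a 3000-byte request starts at offset 1088 and leaves 8 unused bytes at the end of the page, instead of the 1096 bytes it would leave if it started at offset 0. A request of exactly one page gets offset 0, which matches the remark that page-sized and larger allocations must stay page-aligned.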

> Index: malloc.c
> ===
> RCS file: /cvs/src/lib/libc/stdlib/malloc.c,v
> retrieving revision 1.212
> diff -u -p -r1.212 malloc.c
> --- malloc.c  21 Jan 2017 07:47:42 -  1.212
> +++ malloc.c  24 Jan 2017 06:08:13 -
> @@ -1328,13 +1328,9 @@ ofree(struct dir_info *argpool, void *p)
>   sz - mopts.malloc_guard,
>   PAGEROUND(sz - mopts.malloc_guard));
>   } else {
> -#if notyetbecause_of_realloc
>   /* shifted towards the end */
> - if (p != ((char *)r->p) + ((MALLOC_PAGESIZE -
> - MALLOC_MINSIZE - sz - mopts.malloc_guard) &
> - ~(MALLOC_MINSIZE-1))) {
> - }
> -#endif
> + if (p != MALLOC_MOVE(r->p, sz))
> + wrterror(pool, "bogus moved pointer %p", p);
>   p = r->p;
>   }
>   if (mopts.malloc_guard) {
> @@ -1474,7 +1470,7 @@ orealloc(struct dir_info *argpool, void 
>   if (gnewsz > MALLOC_MAXCHUNK)
>   gnewsz += mopts.malloc_guard;
>  
> - if (newsz > MALLOC_MAXCHUNK && oldsz > MALLOC_MAXCHUNK && p == r->p &&
> + if (newsz > MALLOC_MAXCHUNK && oldsz > MALLOC_MAXCHUNK &&
>   !mopts.malloc_realloc) {
>   /* First case: from n pages sized allocation to m pages sized
>  allocation, no malloc_move in effect */
> @@ -1484,7 +1480,7 @@ orealloc(struct dir_info *argpool, void 
>   if (rnewsz > roldsz) {
>   /* try to extend existing region */
>   if (!mopts.malloc_guard) {
> - void *hint = (char *)p + roldsz;
> + void *hint = (char *)r->p + roldsz;
>   size_t needed = rnewsz - roldsz;
>  
>   STATS_INC(pool->cheap_realloc_tries);
> @@ -1502,9 +1498,15 @@ gotit:
>   STATS_ADD(pool->malloc_used, needed);
>   if (mopts.malloc_junk == 2)
>   memset(q, SOME_JUNK, needed);
> - r->size = newsz;
> + r->size = gnewsz;
> + if (r->p != p) {
> + /* old pointer is moved */
> + memmove(r->p, p, oldsz);
> + p = r->p;
> + }
>   if (mopts.chunk_canaries)
> - fill_canary(p, newsz, 
> PAGEROUND(newsz));
> + fill_canary(p, newsz,
> + PAGEROUND(newsz));
>   STATS_SETF(r, f);
>   STATS_INC(pool->cheap_reallocs);
>   ret = p;
> @@ -1517,30 +1519,45 @@ gotit:
>   } else if (rnewsz < roldsz) {
>   /* shrink number of pages */
>   if (mopts.malloc_guard) {
> - if (mprotect((char *)p + roldsz -
> + if (mprotect((char *)r->p + roldsz -
>   mopts.malloc_guard, mopts.malloc_guard,
>   PROT_READ | PROT_WRITE))
>   wrterror(pool, "mprotect");
> - if (mprotect((char *)p + rnewsz -
> + if (mprotect((char *)r->p + rnewsz -
>   mopts.malloc_guard, mopts.malloc_guard,
>   PROT_NONE))
>   wrterror(pool, "mprotect");
>   }
> -

Re: Help with the NET_LOCK()

2017-01-27 Thread Hrvoje Popovski
On 27.1.2017. 20:33, David Hill wrote:
> On Fri, Jan 27, 2017 at 08:09:36PM +0100, Hrvoje Popovski wrote:
>> On 27.1.2017. 19:14, David Hill wrote:
>>>> splassert: yield: want 0 have 1
>>>> Starting stack trace...
>>>> yield() at yield+0xac
>>>> pool_get() at pool_get+0x1ca
>>>> m_get() at m_get+0x28
>>>> ip_ctloutput() at ip_ctloutput+0x4bf
>>>> sogetopt() at sogetopt+0x7e
>>>> sys_getsockopt() at sys_getsockopt+0xbf
>>>> syscall() at syscall+0x27b
>>>> --- syscall (number 118) ---
>>>> end of kernel
>>>> end trace frame: 0x3, count: 250
>>>> 0x978bdd844a:
>>>> End of stack trace.
>>>>
>>>>
>>> Attempted to solve this and am running with this diff:
>>
>> Hi,
>>
>> I applied your patch and I'm still seeing this trace
>>
>>
>> splassert: yield: want 0 have 1
>> Starting stack trace...
>> yield() at yield+0xac
>> pool_get() at pool_get+0x1ca
>> m_get() at m_get+0x28
>> ip_ctloutput() at ip_ctloutput+0x4bf
>> sogetopt() at sogetopt+0xa1
>> sys_getsockopt() at sys_getsockopt+0xbf
>> syscall() at syscall+0x27b
>> --- syscall (number 118) ---
>> end of kernel
>> end trace frame: 0x3, count: 250
>> 0x178f12db8f1a:
>> End of stack trace.
>>
>>
>> and this one I'm seeing for the first time, maybe because of this diff
>>
>> splassert: yield: want 0 have 1
>> Starting stack trace...
>> yield() at yield+0xac
>> malloc() at malloc+0x406
>> ip_setmoptions() at ip_setmoptions+0x248
>> ip_ctloutput() at ip_ctloutput+0x461
>> sosetopt() at sosetopt+0x8e
>> sys_setsockopt() at sys_setsockopt+0x12d
>> syscall() at syscall+0x27b
>> --- syscall (number 105) ---
>> end of kernel
>> end trace frame: 0x1f83, count: 250
>> 0x91243a37f1a:
>> End of stack trace.
>>
> Forgot a file...   Try this:


With this patch I no longer see the trace for syscall 118.

tnx ...



Re: [WWW] Reverse chronological order for faq/current.html

2017-01-27 Thread Raf Czlonka
On Tue, Jan 24, 2017 at 10:29:30AM GMT, Theo de Raadt wrote:
> > On 2017/01/24 09:06, Raf Czlonka wrote:
> > > Another way to look at it is, "Let me have a look if there's anything
> > > new on faq/current.html - I open the page and, *without* moving
> > > forward, can see straight away if something new has been added.
> > 
> > Since we've been doing it the other way for 12 years, I think it would
> > likely cause confusion for existing users..
> 
> For Raf,
> 
> http://tinyurl.com/jakb5bb
> 

I was expecting that one. Still, made me chuckle :^)



Re: [WWW] Reverse chronological order for faq/current.html

2017-01-27 Thread Raf Czlonka
On Tue, Jan 24, 2017 at 01:13:03PM GMT, Nick Holland wrote:
> On 01/24/17 04:06, Raf Czlonka wrote:
> ...
> > Another way to look at it is, "Let me have a look if there's anything
> > new on faq/current.html - I open the page and, *without* moving
> > forward, can see straight away if something new has been added. No?
> > Then I move on with my life without scrolling down or doing anything
> > else apart from opening the page". Given OpenBSD's rapid development,
> > new entries on faq/current.html appear quite frequently - I'm only
> > thinking of the tiny amount of time saved each time.
> 
> What I think you are not thinking of is that in addition to being a list
> of things that have changed, it is also a list of changes that have to
> be done ... often IN PARTICULAR ORDER.
> 
> As it is, you read down until you hit where you are, then follow the
> instructions in order.  "more difficult" in your argument, but logical.
> 
> As you propose, you read down until you find where you are not, then
> change directions and read backwards.  That's not intuitive, normal, or
> reasonable to expect.  Most likely, your plan will have people making
> changes in reverse order...which may often work, but sometimes
> won't...and won't be the order the developers will be testing.

Hi Nick,

This is the most reasonable reply I have received thus far :^)

Thanks,

Raf



Re: [WWW] Reverse chronological order for faq/current.html

2017-01-27 Thread Raf Czlonka
On Tue, Jan 24, 2017 at 10:26:21AM GMT, Stuart Henderson wrote:
> On 2017/01/24 09:06, Raf Czlonka wrote:
> > Another way to look at it is, "Let me have a look if there's anything
> > new on faq/current.html - I open the page and, *without* moving
> > forward, can see straight away if something new has been added.
> 
> Since we've been doing it the other way for 12 years, I think it would
> likely cause confusion for existing users..

I've read somewhere that "We've always done it this way." is the
most dangerous phrase in the language :^)

I completely agree with the latter - every major change requires
re-education.

> > Then I move on with my life without scrolling down or doing anything
> > else apart from opening the page". Given OpenBSD's rapid development,
> > new entries on faq/current.html appear quite frequently - I'm only
> > thinking of the tiny amount of time saved each time.
> 
> If you're running current, I'd recommend keeping an eye on the
> source-changes list, then you'll already know if there's something new
> which affects you :-)
> 

I've been doing that for a long time but cannot always keep up with
the volume :^)

Cheers,

Raf



Re: [WWW] Reverse chronological order for faq/current.html

2017-01-27 Thread Raf Czlonka
On Tue, Jan 24, 2017 at 09:13:51AM GMT, STeve Andre' wrote:
> On 01/24/17 04:08, Theo de Raadt wrote:
> > > Another way to look at it is, "Let me have a look if there's anything
> > > new on faq/current.html - I open the page and, *without* moving
> > > forward, can see straight away if something new has been added. No?
> > > Then I move on with my life without scrolling down or doing anything
> > > else apart from opening the page". Given OpenBSD's rapid development,
> > > new entries on faq/current.html appear quite frequently - I'm only
> > > thinking of the tiny amount of time saved each time.
> > 
> > Yes clearly I'm not considering your valuable time.
> > 
> > 
> 
> Raf, think about the physical world.  When people add things to a list
> like a posting on a bulletin board, it goes at the end.  People just
> know to look at the end for anything new.  So it is online.  The effort
> to scroll down is pretty small.

STeve, I've already given an example where reverse chronology is
being used, another being CVS revision history, i.e. [0], so the
above isn't always true.

Regards,

Raf

[0] http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/Makefile



Re: [WWW] Reverse chronological order for faq/current.html

2017-01-27 Thread Raf Czlonka
On Tue, Jan 24, 2017 at 09:08:07AM GMT, Theo de Raadt wrote:
> > Another way to look at it is, "Let me have a look if there's anything
> > new on faq/current.html - I open the page and, *without* moving
> > forward, can see straight away if something new has been added. No?
> > Then I move on with my life without scrolling down or doing anything
> > else apart from opening the page". Given OpenBSD's rapid development,
> > new entries on faq/current.html appear quite frequently - I'm only
> > thinking of the tiny amount of time saved each time.
> 
> Yes clearly I'm not considering your valuable time.
> 

There's no need for that.



Re: err with multiple TLS sites but one OCSP?

2017-01-27 Thread Michael W. Lucas
On Fri, Jan 27, 2017 at 09:53:25PM +, Bob Beck wrote:
>On Fri, Jan 27, 2017 at 14:12 Michael W. Lucas
>  Or a misconfiguration.  show configs


Configs follow.

# cat /etc/httpd.conf
include "/etc/sites/www3.conf"
include "/etc/sites/www4.conf"

www3.conf:

server "www3.mwlucas.org" {
   listen on * port 80
   block return 302 "https://$SERVER_NAME$REQUEST_URI"
}


server "www3.mwlucas.org" {
alias tarpit.mwlucas.org
listen on * tls port 443
hsts
# TLS certificate and key files created with acme-client(1)
tls certificate "/etc/ssl/acme/www3/www3.fullchain.pem"
tls key "/etc/ssl/acme/www3/www3.key"
tls ocsp "/etc/ssl/acme/www3/www3.der"
tcp nodelay

   location "/.well-known/acme-challenge/*" {
   root "/acme"
   root strip 2
   }
}


www4:

server "www4.mwlucas.org" {
alias bill.mwlucas.org
alias auction.mwlucas.org
listen on * port 80

   location "/.well-known/acme-challenge/*" {
   root "/acme"
   root strip 2
   }


block return 301 "https://$DOCUMENT_URI"
}

server "www4.mwlucas.org" {
alias bill.mwlucas.org
alias auction.mwlucas.org
root "/www4"
listen on * tls port 443
hsts
# TLS certificate and key files created with acme-client(1)
tls certificate "/etc/ssl/acme/www4/www4.fullchain.pem"
tls key "/etc/ssl/acme/www4/www4.key"
#   tls ocsp "/etc/ssl/acme/www4/www4.der"
tcp nodelay
   location "/.well-known/acme-challenge/*" {
   root "/acme"
   root strip 2
   }

}




-- 
Michael W. Lucas         Twitter @mwlauthor
nonfiction: https://www.michaelwlucas.com/
fiction: https://www.michaelwarrenlucas.com/
blog: http://blather.michaelwlucas.com/



Re: err with multiple TLS sites but one OCSP?

2017-01-27 Thread Bob Beck
On Fri, Jan 27, 2017 at 15:23 Stuart Henderson  wrote:

> On 2017/01/27 22:09, Bob Beck wrote:
> > I think you have more issues than ocsp. if thats the same host you can't
> > have two different tls certs on the same ip.   and you have them both on
> > *443
> >
> > try using a separate ip for each
>
> Wasn't SNI support added to httpd already?

Hmm, right. But I bet it'll work with explicit separate IPs. Stapling, on
the other hand, will be per config, so that's probably what's fighting.
A separate IP would confirm that.

I'm tired. I'll look at it tomorrow unless someone else does.


Re: err with multiple TLS sites but one OCSP?

2017-01-27 Thread Stuart Henderson
On 2017/01/27 22:09, Bob Beck wrote:
> I think you have more issues than ocsp. if thats the same host you can't
> have two different tls certs on the same ip.   and you have them both on
> *443
> 
> try using a separate ip for each

Wasn't SNI support added to httpd already?



Re: err with multiple TLS sites but one OCSP?

2017-01-27 Thread Bob Beck
I think you have more issues than ocsp. if thats the same host you can't
have two different tls certs on the same ip.   and you have them both on
*443

try using a separate ip for each



On Fri, Jan 27, 2017 at 15:03 Michael W. Lucas 
wrote:

> On Fri, Jan 27, 2017 at 09:53:25PM +, Bob Beck wrote:
> >On Fri, Jan 27, 2017 at 14:12 Michael W. Lucas
> >  Or a misconfiguration.  show configs
>
>
> Configs follow.
>
> # cat /etc/httpd.conf
> include "/etc/sites/www3.conf"
> include "/etc/sites/www4.conf"
>
> www3.conf:
>
> server "www3.mwlucas.org" {
>    listen on * port 80
>    block return 302 "https://$SERVER_NAME$REQUEST_URI"
> }
>
>
> server "www3.mwlucas.org" {
> alias tarpit.mwlucas.org
> listen on * tls port 443
> hsts
> # TLS certificate and key files created with acme-client(1)
> tls certificate "/etc/ssl/acme/www3/www3.fullchain.pem"
> tls key "/etc/ssl/acme/www3/www3.key"
> tls ocsp "/etc/ssl/acme/www3/www3.der"
> tcp nodelay
>
>    location "/.well-known/acme-challenge/*" {
>    root "/acme"
>    root strip 2
>    }
> }
>
>
> www4:
>
> server "www4.mwlucas.org" {
> alias bill.mwlucas.org
> alias auction.mwlucas.org
> listen on * port 80
>
>    location "/.well-known/acme-challenge/*" {
>    root "/acme"
>    root strip 2
>    }
>
>
> block return 301 "https://$DOCUMENT_URI"
> }
>
> server "www4.mwlucas.org" {
> alias bill.mwlucas.org
> alias auction.mwlucas.org
> root "/www4"
> listen on * tls port 443
> hsts
> # TLS certificate and key files created with acme-client(1)
> tls certificate "/etc/ssl/acme/www4/www4.fullchain.pem"
> tls key "/etc/ssl/acme/www4/www4.key"
> #   tls ocsp "/etc/ssl/acme/www4/www4.der"
> tcp nodelay
>    location "/.well-known/acme-challenge/*" {
>    root "/acme"
>    root strip 2
>    }
>
> }
>
>
> --
> Michael W. Lucas         Twitter @mwlauthor
> nonfiction: https://www.michaelwlucas.com/
> fiction: https://www.michaelwarrenlucas.com/
> blog: http://blather.michaelwlucas.com/
>


Re: err with multiple TLS sites but one OCSP?

2017-01-27 Thread Bob Beck
On Fri, Jan 27, 2017 at 14:12 Michael W. Lucas 
wrote:

> On Fri, Jan 27, 2017 at 02:50:29PM -0500, Michael W. Lucas wrote:
> > On Fri, Jan 27, 2017 at 06:49:06PM +, Stuart Henderson wrote:
> > > That looks like a web server bug, it shouldn't return a staple

Or a misconfiguration.  show configs

> > > in that case.  What software are you using for that?
> >
> >
> >
> > OpenBSD httpd, of course. amd64 snapshot downloaded yesterday from
> > ftp3.usa.openbsd.org.
>
> To be clear, that's a "How the hell could I forget to include that?"
> facepalm, not anything about Stuart asking the question...
>
> --
> Michael W. Lucas         Twitter @mwlauthor
> nonfiction: https://www.michaelwlucas.com/
> fiction: https://www.michaelwarrenlucas.com/
> blog: http://blather.michaelwlucas.com/
>


Re: err with multiple TLS sites but one OCSP?

2017-01-27 Thread Michael W. Lucas
On Fri, Jan 27, 2017 at 02:50:29PM -0500, Michael W. Lucas wrote:
> On Fri, Jan 27, 2017 at 06:49:06PM +, Stuart Henderson wrote:
> > That looks like a web server bug, it shouldn't return a staple
> > in that case.  What software are you using for that?
> 
> 
> 
> OpenBSD httpd, of course. amd64 snapshot downloaded yesterday from
> ftp3.usa.openbsd.org.

To be clear, that's a "How the hell could I forget to include that?"
facepalm, not anything about Stuart asking the question...

-- 
Michael W. Lucas         Twitter @mwlauthor
nonfiction: https://www.michaelwlucas.com/
fiction: https://www.michaelwarrenlucas.com/
blog: http://blather.michaelwlucas.com/



Re: err with multiple TLS sites but one OCSP?

2017-01-27 Thread Michael W. Lucas
On Fri, Jan 27, 2017 at 06:49:06PM +, Stuart Henderson wrote:
> That looks like a web server bug, it shouldn't return a staple
> in that case.  What software are you using for that?



OpenBSD httpd, of course. amd64 snapshot downloaded yesterday from
ftp3.usa.openbsd.org.

==ml

-- 
Michael W. Lucas         Twitter @mwlauthor
nonfiction: https://www.michaelwlucas.com/
fiction: https://www.michaelwarrenlucas.com/
blog: http://blather.michaelwlucas.com/



Re: Help with the NET_LOCK()

2017-01-27 Thread David Hill
On Fri, Jan 27, 2017 at 08:09:36PM +0100, Hrvoje Popovski wrote:
> On 27.1.2017. 19:14, David Hill wrote:
> >> splassert: yield: want 0 have 1
> >> Starting stack trace...
> >> yield() at yield+0xac
> >> pool_get() at pool_get+0x1ca
> >> m_get() at m_get+0x28
> >> ip_ctloutput() at ip_ctloutput+0x4bf
> >> sogetopt() at sogetopt+0x7e
> >> sys_getsockopt() at sys_getsockopt+0xbf
> >> syscall() at syscall+0x27b
> >> --- syscall (number 118) ---
> >> end of kernel
> >> end trace frame: 0x3, count: 250
> >> 0x978bdd844a:
> >> End of stack trace.
> >>  
> >>
> > Attempted to solve this and am running with this diff:
> 
> 
> Hi,
> 
> I applied your patch and I'm still seeing this trace
> 
> 
> splassert: yield: want 0 have 1
> Starting stack trace...
> yield() at yield+0xac
> pool_get() at pool_get+0x1ca
> m_get() at m_get+0x28
> ip_ctloutput() at ip_ctloutput+0x4bf
> sogetopt() at sogetopt+0xa1
> sys_getsockopt() at sys_getsockopt+0xbf
> syscall() at syscall+0x27b
> --- syscall (number 118) ---
> end of kernel
> end trace frame: 0x3, count: 250
> 0x178f12db8f1a:
> End of stack trace.
> 
> 
> and this one I'm seeing for the first time, maybe because of this diff
> 
> splassert: yield: want 0 have 1
> Starting stack trace...
> yield() at yield+0xac
> malloc() at malloc+0x406
> ip_setmoptions() at ip_setmoptions+0x248
> ip_ctloutput() at ip_ctloutput+0x461
> sosetopt() at sosetopt+0x8e
> sys_setsockopt() at sys_setsockopt+0x12d
> syscall() at syscall+0x27b
> --- syscall (number 105) ---
> end of kernel
> end trace frame: 0x1f83, count: 250
> 0x91243a37f1a:
> End of stack trace.
>

Forgot a file...   Try this:

 
Index: kern/uipc_socket.c
===
RCS file: /cvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.174
diff -u -p -r1.174 uipc_socket.c
--- kern/uipc_socket.c  26 Jan 2017 00:08:50 -  1.174
+++ kern/uipc_socket.c  27 Jan 2017 19:30:31 -
@@ -1758,11 +1758,19 @@ sogetopt(struct socket *so, int level, i
 
if (level != SOL_SOCKET) {
if (so->so_proto && so->so_proto->pr_ctloutput) {
+   m = m_get(M_WAIT, MT_SOOPTS);
+   m->m_len = 0;
+
NET_LOCK(s);
error = (*so->so_proto->pr_ctloutput)(PRCO_GETOPT, so,
-   level, optname, mp);
+   level, optname, &m);
NET_UNLOCK(s);
-   return (error);
+   if (error) {
+   m_free(m);
+   return (error);
+   }
+   *mp = m;
+   return (0);
} else
return (ENOPROTOOPT);
} else {
@@ -1835,21 +1843,25 @@ sogetopt(struct socket *so, int level, i
}
 
case SO_RTABLE:
-   (void)m_free(m);
if (so->so_proto && so->so_proto->pr_domain &&
so->so_proto->pr_domain->dom_protosw &&
so->so_proto->pr_ctloutput) {
struct domain *dom = so->so_proto->pr_domain;
 
level = dom->dom_protosw->pr_protocol;
+   
NET_LOCK(s);
error = (*so->so_proto->pr_ctloutput)
-   (PRCO_GETOPT, so, level, optname, mp);
+   (PRCO_GETOPT, so, level, optname, &m);
NET_UNLOCK(s);
-   return (error);
+   if (error) {
+   (void)m_free(m);
+   return (error);
+   }
+   break;
}
+   (void)m_free(m);
return (ENOPROTOOPT);
-   break;
 
 #ifdef SOCKET_SPLICE
case SO_SPLICE:
@@ -1880,7 +1892,6 @@ sogetopt(struct socket *so, int level, i
}
(void)m_free(m);
return (EOPNOTSUPP);
-   break;
 
default:
(void)m_free(m);
Index: net/rtsock.c
===
RCS file: /cvs/src/sys/net/rtsock.c,v
retrieving revision 1.220
diff -u -p -r1.220 rtsock.c
--- net/rtsock.c  24 Jan 2017 00:17:14 -  1.220
+++ net/rtsock.c  27 Jan 2017 19:30:31 -
@@ -277,12 +277,12 @@ route_ctloutput(int op, struct socket *s
case PRCO_GETOPT:
switch (optname) {
case ROUTE_MSGFILTER:
-   *mp = m = m_get(M_WAIT, MT_SOOPTS);
+   m = *mp;
m->m_len = sizeof(unsigned 

Re: Help with the NET_LOCK()

2017-01-27 Thread Hrvoje Popovski
On 27.1.2017. 19:14, David Hill wrote:
>> splassert: yield: want 0 have 1
>> Starting stack trace...
>> yield() at yield+0xac
>> pool_get() at pool_get+0x1ca
>> m_get() at m_get+0x28
>> ip_ctloutput() at ip_ctloutput+0x4bf
>> sogetopt() at sogetopt+0x7e
>> sys_getsockopt() at sys_getsockopt+0xbf
>> syscall() at syscall+0x27b
>> --- syscall (number 118) ---
>> end of kernel
>> end trace frame: 0x3, count: 250
>> 0x978bdd844a:
>> End of stack trace.
>>  
>>
> Attempted to solve this and am running with this diff:


Hi,

I applied your patch and I'm still seeing this trace


splassert: yield: want 0 have 1
Starting stack trace...
yield() at yield+0xac
pool_get() at pool_get+0x1ca
m_get() at m_get+0x28
ip_ctloutput() at ip_ctloutput+0x4bf
sogetopt() at sogetopt+0xa1
sys_getsockopt() at sys_getsockopt+0xbf
syscall() at syscall+0x27b
--- syscall (number 118) ---
end of kernel
end trace frame: 0x3, count: 250
0x178f12db8f1a:
End of stack trace.


and this one I'm seeing for the first time, maybe because of this diff

splassert: yield: want 0 have 1
Starting stack trace...
yield() at yield+0xac
malloc() at malloc+0x406
ip_setmoptions() at ip_setmoptions+0x248
ip_ctloutput() at ip_ctloutput+0x461
sosetopt() at sosetopt+0x8e
sys_setsockopt() at sys_setsockopt+0x12d
syscall() at syscall+0x27b
--- syscall (number 105) ---
end of kernel
end trace frame: 0x1f83, count: 250
0x91243a37f1a:
End of stack trace.



Re: err with multiple TLS sites but one OCSP?

2017-01-27 Thread Stuart Henderson
On 2017/01/27 13:10, Michael W. Lucas wrote:
> Hi,
> 
> Not sure if this is an expected part of OCSP or a bug.
> 
> I've configured two TLS sites on one host, one with OCSP stapling
> (www3.mwlucas.org) and one without (www4.mwlucas.org). The OCSP site
> works fine, but the non-OCSP site generates an err.
> 
> It *appears* that queries to the non-OCSP site return the OCSP site's
> OCSP cert.
> 
> Following please find openssl queries on both. Feel free to check the
> sites yourself, I'm FAR from a TLS guru.

That looks like a web server bug, it shouldn't return a staple
in that case.  What software are you using for that?
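
The expectation stated above (a server without a configured staple should not send one) can be illustrated with a small lookup keyed on the SNI name. This is only a sketch of the expected behaviour, not how httpd or libtls implement it: `struct sketch_server`, `sketch_servers` and `sketch_staple_for` are hypothetical names, and the table merely mirrors the two servers and the one `tls ocsp` path from the posted configs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-server record: the SNI name and its OCSP staple file, if any. */
struct sketch_server {
	const char *name;
	const char *ocsp_file;	/* NULL means no stapling configured */
};

/* Mirrors the posted configs: www3 has "tls ocsp", www4 has it commented out. */
static const struct sketch_server sketch_servers[] = {
	{ "www3.mwlucas.org", "/etc/ssl/acme/www3/www3.der" },
	{ "www4.mwlucas.org", NULL },
};

/*
 * Return the staple file for the server matching sni_name, or NULL when
 * that server has none or the name is unknown. The staple of another
 * server is never returned.
 */
static const char *
sketch_staple_for(const char *sni_name)
{
	size_t i;

	assert(sni_name != NULL);
	for (i = 0; i < sizeof(sketch_servers) / sizeof(sketch_servers[0]); i++) {
		if (strcmp(sketch_servers[i].name, sni_name) == 0)
			return sketch_servers[i].ocsp_file;
	}
	return NULL;
}
```

In this sketch a query for www4.mwlucas.org yields NULL rather than the www3 staple, which is the behaviour the openssl output below suggests is not happening.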

> # openssl s_client -connect www4.mwlucas.org:443 -status -servername www4.mwlucas.org
> ...
> verify return:1
> OCSP response:
> ==
> OCSP Response Data:
> OCSP Response Status: successful (0x0)
> Response Type: Basic OCSP Response
> Version: 1 (0x0)
> Responder Id: C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
> Produced At: Jan 26 23:02:00 2017 GMT
> Responses:
> Certificate ID:
>   Hash Algorithm: sha1
>   Issuer Name Hash: 7EE66AE7729AB3FCF8A220646C16A12D6071085D
>   Issuer Key Hash: A84A6A63047DDDBAE6D139B7A64565EFF3A8ECA1
>   Serial Number: 032CBDA721856F117CC7D57A72BBFA77B578
> Cert Status: good
> This Update: Jan 26 23:00:00 2017 GMT
> Next Update: Feb  2 23:00:00 2017 GMT
> 
> Signature Algorithm: sha256WithRSAEncryption
>  6a:1e:f1:44:8c:a9:a6:7e:40:25:3a:f7:50:e9:43:42:0f:74:
>  9b:dc:ee:56:a3:47:0b:ce:73:88:ee:f0:84:fc:b0:25:5b:3d:
>  67:d0:66:20:c7:60:7c:ee:26:91:72:4e:d0:f2:67:5a:e3:c1:
>  06:57:31:47:29:1a:55:19:48:e7:e6:32:0b:18:d9:33:9d:55:
>  d7:36:38:f1:96:57:bc:5d:89:82:31:bb:4e:12:0c:5c:ab:1a:
>  f6:1d:a1:48:be:1c:1d:3b:52:a0:60:2f:1d:f9:3c:48:cd:df:
>  a6:5e:b5:79:0c:b9:ed:d5:61:29:53:ee:83:5f:89:af:35:27:
>  d6:94:05:f5:fb:d1:a8:4d:26:8d:8b:cf:e9:db:53:ad:e6:47:
>  a7:db:91:9e:9d:a1:b2:2c:1e:d9:98:c5:af:5c:12:d1:04:5a:
>  82:be:8d:80:1f:38:c2:5d:b1:6f:99:e1:ca:53:71:1c:85:0d:
>  3e:f3:14:bc:3b:c9:c0:dd:6b:ec:59:d4:54:dc:fb:9c:da:72:
>  91:45:61:55:69:e9:75:51:8f:e2:82:6a:dd:ec:bc:bd:3c:2c:
>  92:43:f7:d9:65:1d:60:14:91:e0:b0:2b:46:25:49:35:74:99:
>  71:a3:c0:d0:91:66:29:7e:01:1b:35:f1:2e:40:dc:f3:4d:98:
>  69:40:6f:46
> 
> 
> # openssl s_client -connect www3.mwlucas.org:443 -status -servername www3.mwlucas.org
> CONNECTED(0003)
> depth=2 O = Digital Signature Trust Co., CN = DST Root CA X3
> verify return:1
> depth=1 C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
> verify return:1
> depth=0 CN = www3.mwlucas.org
> verify return:1
> OCSP response:
> ==
> OCSP Response Data:
> OCSP Response Status: successful (0x0)
> Response Type: Basic OCSP Response
> Version: 1 (0x0)
> Responder Id: C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
> Produced At: Jan 26 23:02:00 2017 GMT
> Responses:
> Certificate ID:
>   Hash Algorithm: sha1
>   Issuer Name Hash: 7EE66AE7729AB3FCF8A220646C16A12D6071085D
>   Issuer Key Hash: A84A6A63047DDDBAE6D139B7A64565EFF3A8ECA1
>   Serial Number: 032CBDA721856F117CC7D57A72BBFA77B578
> Cert Status: good
> This Update: Jan 26 23:00:00 2017 GMT
> Next Update: Feb  2 23:00:00 2017 GMT
> 
> Signature Algorithm: sha256WithRSAEncryption
>  6a:1e:f1:44:8c:a9:a6:7e:40:25:3a:f7:50:e9:43:42:0f:74:
>  9b:dc:ee:56:a3:47:0b:ce:73:88:ee:f0:84:fc:b0:25:5b:3d:
>  67:d0:66:20:c7:60:7c:ee:26:91:72:4e:d0:f2:67:5a:e3:c1:
>  06:57:31:47:29:1a:55:19:48:e7:e6:32:0b:18:d9:33:9d:55:
>  d7:36:38:f1:96:57:bc:5d:89:82:31:bb:4e:12:0c:5c:ab:1a:
>  f6:1d:a1:48:be:1c:1d:3b:52:a0:60:2f:1d:f9:3c:48:cd:df:
>  a6:5e:b5:79:0c:b9:ed:d5:61:29:53:ee:83:5f:89:af:35:27:
>  d6:94:05:f5:fb:d1:a8:4d:26:8d:8b:cf:e9:db:53:ad:e6:47:
>  a7:db:91:9e:9d:a1:b2:2c:1e:d9:98:c5:af:5c:12:d1:04:5a:
>  82:be:8d:80:1f:38:c2:5d:b1:6f:99:e1:ca:53:71:1c:85:0d:
>  3e:f3:14:bc:3b:c9:c0:dd:6b:ec:59:d4:54:dc:fb:9c:da:72:
>  91:45:61:55:69:e9:75:51:8f:e2:82:6a:dd:ec:bc:bd:3c:2c:
>  92:43:f7:d9:65:1d:60:14:91:e0:b0:2b:46:25:49:35:74:99:
>  71:a3:c0:d0:91:66:29:7e:01:1b:35:f1:2e:40:dc:f3:4d:98:
>  69:40:6f:46
> ==
> ...
> 
> ==ml
> 
> 
> -- 
> Michael W. Lucas         Twitter @mwlauthor
> nonfiction: https://www.michaelwlucas.com/
> fiction: https://www.michaelwarrenlucas.com/
> blog: http://blather.michaelwlucas.com/
> 



Re: Help with the NET_LOCK()

2017-01-27 Thread David Hill
On Wed, Jan 25, 2017 at 11:14:57AM -0500, David Hill wrote:
> On Wed, Jan 25, 2017 at 04:32:25PM +1000, Martin Pieuchot wrote:
> > I just enabled the NET_LOCK() again and I'm looking for test reports.
> > Please go build a kernel from sources or wait for the next snapshot,
> > run it and report back.
> > 
> > If you're looking for some small coding tasks related to the NET_LOCK()
> > just do:
> > 
> > # sysctl kern.splassert=2
> > # sysctl kern.pool_debug=2
> > 
> > Then watch for the traces on your console.
> > 
> > You'll see something like:
> > 
> > Starting stack trace...
> > yield(0,1,d09dac52,f5549dbc,d94e9378) at yield+0xa4
> > yield(d0bc8f40,1,f5549e18,80,14) at yield+0xa4
> > pool_get(d0bc8f40,1,f5549ec8,d03ecbfb,d97815f4) at pool_get+0x1ba
> > m_get(1,3,f5549ec0,d03a9362,d0bc22e0) at m_get+0x30
> > doaccept(d977e6c4,3,cf7ee4f8,cf7ee4ec,2000) at doaccept+0x193
> > sys_accept(d977e6c4,f5549f5c,f5549f7c,0,f5549fa8) at sys_accept+0x37
> > syscall() at syscall+0x250
> > 
> > This means accept(2) is doing a memory allocation that can sleep, here
> > with m_get(9), while holding the NET_LOCK().  Even if these should be
> > ok, it is easy to avoid them.  In the case of doaccept() a mbuf could
> > be allocated beforehand or simply use the stack for that.
> > 
> > Cheers,
> > Martin
> >
> 
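
The advice quoted above (do the allocation that may sleep before taking the lock) can be sketched in user space as follows. Everything here is a stand-in and an assumption: `sketch_lock`/`sketch_unlock` play the role of NET_LOCK()/NET_UNLOCK(), `sketch_alloc_may_sleep` plays the role of m_get(M_WAIT, ...), and the violation counter only imitates what splassert reports. It is not kernel code.

```c
#include <assert.h>
#include <stdlib.h>

static int sketch_lock_held;	/* 1 while the stand-in lock is held */
static int sketch_violations;	/* sleeping allocations made under the lock */

static void
sketch_lock(void)
{
	assert(!sketch_lock_held);
	sketch_lock_held = 1;
}

static void
sketch_unlock(void)
{
	assert(sketch_lock_held);
	sketch_lock_held = 0;
}

/* Stand-in for an allocation that may sleep; records use under the lock. */
static void *
sketch_alloc_may_sleep(size_t n)
{
	if (sketch_lock_held)
		sketch_violations++;
	return calloc(1, n);
}

/* Old shape: the allocation happens while the lock is held. */
static int
sketch_getopt_old(int **out)
{
	int *buf;

	sketch_lock();
	buf = sketch_alloc_may_sleep(sizeof(*buf));
	if (buf != NULL)
		*buf = 42;	/* pretend the protocol filled in the option */
	sketch_unlock();
	if (buf == NULL)
		return -1;
	*out = buf;
	return 0;
}

/* New shape: allocate first, hold the lock only around the protocol work. */
static int
sketch_getopt_new(int **out)
{
	int *buf = sketch_alloc_may_sleep(sizeof(*buf));

	if (buf == NULL)
		return -1;
	sketch_lock();
	*buf = 42;		/* pretend the protocol filled in the option */
	sketch_unlock();
	*out = buf;
	return 0;
}
```

Calling `sketch_getopt_old` bumps `sketch_violations`, while `sketch_getopt_new` leaves it unchanged. In both shapes the caller owns the returned buffer and frees it.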
> splassert: yield: want 0 have 1
> Starting stack trace...
> yield() at yield+0xac
> pool_get() at pool_get+0x1ca
> m_get() at m_get+0x28
> ip_ctloutput() at ip_ctloutput+0x4bf
> sogetopt() at sogetopt+0x7e
> sys_getsockopt() at sys_getsockopt+0xbf
> syscall() at syscall+0x27b
> --- syscall (number 118) ---
> end of kernel
> end trace frame: 0x3, count: 250
> 0x978bdd844a:
> End of stack trace.
>  
>

Attempted to solve this and am running with this diff:

Index: kern/uipc_socket.c
===
RCS file: /cvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.174
diff -u -p -r1.174 uipc_socket.c
--- kern/uipc_socket.c  26 Jan 2017 00:08:50 -  1.174
+++ kern/uipc_socket.c  27 Jan 2017 18:08:26 -
@@ -1758,11 +1758,19 @@ sogetopt(struct socket *so, int level, i
 
if (level != SOL_SOCKET) {
if (so->so_proto && so->so_proto->pr_ctloutput) {
+   m = m_get(M_WAIT, MT_SOOPTS);
+   m->m_len = 0;
+
NET_LOCK(s);
error = (*so->so_proto->pr_ctloutput)(PRCO_GETOPT, so,
-   level, optname, mp);
+   level, optname, &m);
NET_UNLOCK(s);
-   return (error);
+   if (error) {
+   m_free(m);
+   return (error);
+   }
+   *mp = m;
+   return (0);
} else
return (ENOPROTOOPT);
} else {
@@ -1835,21 +1843,25 @@ sogetopt(struct socket *so, int level, i
}
 
case SO_RTABLE:
-   (void)m_free(m);
if (so->so_proto && so->so_proto->pr_domain &&
so->so_proto->pr_domain->dom_protosw &&
so->so_proto->pr_ctloutput) {
struct domain *dom = so->so_proto->pr_domain;
 
level = dom->dom_protosw->pr_protocol;
+   
NET_LOCK(s);
error = (*so->so_proto->pr_ctloutput)
-   (PRCO_GETOPT, so, level, optname, mp);
+   (PRCO_GETOPT, so, level, optname, &m);
NET_UNLOCK(s);
-   return (error);
+   if (error) {
+   (void)m_free(m);
+   return (error);
+   }
+   break;
}
+   (void)m_free(m);
return (ENOPROTOOPT);
-   break;
 
 #ifdef SOCKET_SPLICE
case SO_SPLICE:
@@ -1880,7 +1892,6 @@ sogetopt(struct socket *so, int level, i
}
(void)m_free(m);
return (EOPNOTSUPP);
-   break;
 
default:
(void)m_free(m);
Index: net/rtsock.c
===
RCS file: /cvs/src/sys/net/rtsock.c,v
retrieving revision 1.220
diff -u -p -r1.220 rtsock.c
--- net/rtsock.c  24 Jan 2017 00:17:14 -  1.220
+++ net/rtsock.c  27 Jan 2017 18:08:26 -
@@ -277,12 +277,12 @@ route_ctloutput(int op, struct socket *s
case PRCO_GETOPT:
switch 

err with multiple TLS sites but one OCSP?

2017-01-27 Thread Michael W. Lucas
Hi,

Not sure if this is an expected part of OCSP or a bug.

I've configured two TLS sites on one host, one with OCSP stapling
(www3.mwlucas.org) and one without (www4.mwlucas.org). The OCSP site
works fine, but the non-OCSP site generates an error.

It *appears* that queries to the non-OCSP site return the OCSP site's
stapled OCSP response.

The openssl queries for both sites follow. Feel free to check the
sites yourself; I'm FAR from a TLS guru.

# openssl s_client -connect www4.mwlucas.org:443 -status -servername www4.mwlucas.org
...
verify return:1
OCSP response:
==
OCSP Response Data:
OCSP Response Status: successful (0x0)
Response Type: Basic OCSP Response
Version: 1 (0x0)
Responder Id: C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
Produced At: Jan 26 23:02:00 2017 GMT
Responses:
Certificate ID:
  Hash Algorithm: sha1
  Issuer Name Hash: 7EE66AE7729AB3FCF8A220646C16A12D6071085D
  Issuer Key Hash: A84A6A63047DDDBAE6D139B7A64565EFF3A8ECA1
  Serial Number: 032CBDA721856F117CC7D57A72BBFA77B578
Cert Status: good
This Update: Jan 26 23:00:00 2017 GMT
Next Update: Feb  2 23:00:00 2017 GMT

Signature Algorithm: sha256WithRSAEncryption
 6a:1e:f1:44:8c:a9:a6:7e:40:25:3a:f7:50:e9:43:42:0f:74:
 9b:dc:ee:56:a3:47:0b:ce:73:88:ee:f0:84:fc:b0:25:5b:3d:
 67:d0:66:20:c7:60:7c:ee:26:91:72:4e:d0:f2:67:5a:e3:c1:
 06:57:31:47:29:1a:55:19:48:e7:e6:32:0b:18:d9:33:9d:55:
 d7:36:38:f1:96:57:bc:5d:89:82:31:bb:4e:12:0c:5c:ab:1a:
 f6:1d:a1:48:be:1c:1d:3b:52:a0:60:2f:1d:f9:3c:48:cd:df:
 a6:5e:b5:79:0c:b9:ed:d5:61:29:53:ee:83:5f:89:af:35:27:
 d6:94:05:f5:fb:d1:a8:4d:26:8d:8b:cf:e9:db:53:ad:e6:47:
 a7:db:91:9e:9d:a1:b2:2c:1e:d9:98:c5:af:5c:12:d1:04:5a:
 82:be:8d:80:1f:38:c2:5d:b1:6f:99:e1:ca:53:71:1c:85:0d:
 3e:f3:14:bc:3b:c9:c0:dd:6b:ec:59:d4:54:dc:fb:9c:da:72:
 91:45:61:55:69:e9:75:51:8f:e2:82:6a:dd:ec:bc:bd:3c:2c:
 92:43:f7:d9:65:1d:60:14:91:e0:b0:2b:46:25:49:35:74:99:
 71:a3:c0:d0:91:66:29:7e:01:1b:35:f1:2e:40:dc:f3:4d:98:
 69:40:6f:46


# openssl s_client -connect www3.mwlucas.org:443 -status -servername www3.mwlucas.org
CONNECTED(0003)
depth=2 O = Digital Signature Trust Co., CN = DST Root CA X3
verify return:1
depth=1 C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
verify return:1
depth=0 CN = www3.mwlucas.org
verify return:1
OCSP response:
==
OCSP Response Data:
OCSP Response Status: successful (0x0)
Response Type: Basic OCSP Response
Version: 1 (0x0)
Responder Id: C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3
Produced At: Jan 26 23:02:00 2017 GMT
Responses:
Certificate ID:
  Hash Algorithm: sha1
  Issuer Name Hash: 7EE66AE7729AB3FCF8A220646C16A12D6071085D
  Issuer Key Hash: A84A6A63047DDDBAE6D139B7A64565EFF3A8ECA1
  Serial Number: 032CBDA721856F117CC7D57A72BBFA77B578
Cert Status: good
This Update: Jan 26 23:00:00 2017 GMT
Next Update: Feb  2 23:00:00 2017 GMT

Signature Algorithm: sha256WithRSAEncryption
 6a:1e:f1:44:8c:a9:a6:7e:40:25:3a:f7:50:e9:43:42:0f:74:
 9b:dc:ee:56:a3:47:0b:ce:73:88:ee:f0:84:fc:b0:25:5b:3d:
 67:d0:66:20:c7:60:7c:ee:26:91:72:4e:d0:f2:67:5a:e3:c1:
 06:57:31:47:29:1a:55:19:48:e7:e6:32:0b:18:d9:33:9d:55:
 d7:36:38:f1:96:57:bc:5d:89:82:31:bb:4e:12:0c:5c:ab:1a:
 f6:1d:a1:48:be:1c:1d:3b:52:a0:60:2f:1d:f9:3c:48:cd:df:
 a6:5e:b5:79:0c:b9:ed:d5:61:29:53:ee:83:5f:89:af:35:27:
 d6:94:05:f5:fb:d1:a8:4d:26:8d:8b:cf:e9:db:53:ad:e6:47:
 a7:db:91:9e:9d:a1:b2:2c:1e:d9:98:c5:af:5c:12:d1:04:5a:
 82:be:8d:80:1f:38:c2:5d:b1:6f:99:e1:ca:53:71:1c:85:0d:
 3e:f3:14:bc:3b:c9:c0:dd:6b:ec:59:d4:54:dc:fb:9c:da:72:
 91:45:61:55:69:e9:75:51:8f:e2:82:6a:dd:ec:bc:bd:3c:2c:
 92:43:f7:d9:65:1d:60:14:91:e0:b0:2b:46:25:49:35:74:99:
 71:a3:c0:d0:91:66:29:7e:01:1b:35:f1:2e:40:dc:f3:4d:98:
 69:40:6f:46
==
...

==ml


-- 
Michael W. Lucas    Twitter @mwlauthor
nonfiction: https://www.michaelwlucas.com/
fiction: https://www.michaelwarrenlucas.com/
blog: http://blather.michaelwlucas.com/



Re: add support for multiple transmit queues on interfaces

2017-01-27 Thread Simon Mages
I did some tests.

The performance did not change.
I think this is the expected behaviour.

BR
Simon

2017-01-23 7:35 GMT+01:00, David Gwynne :
> hrvoje popovski hit a problem where the kernel would panic under load.
>
> i mistakenly called an interface's qstart routine directly from
> if_enqueue rather than via the ifq serializer. this meant that txeof
> routines on network cards calling ifq_restart would cause the start
> routine to run concurrently, therefore causing corruption of the
> ring state.
>
> this diff fixes that.
>
> On Mon, Jan 23, 2017 at 01:09:57PM +1000, David Gwynne wrote:
>> the short explanation is that this lets interfaces allocate multiple
>> ifq structures that can be mapped to their transmit rings. the
>> mechanism for this is a driver calling if_attach_queues() after
>> theyve called if_attach().
>>
>> the long version is that this has if_enqueue access an array of
>> ifqueues on the interface instead of if_snd directly. the ifq is
>> picked by asking the queue discipline (priq or hfsc) to map an mbuf
>> to a slot in the if_ifqs array.
>>
>> to notify the driver that a particular queue needs to start ive
>> added a new function pointer to ifnet called if_qstart. if_qstart
>> takes an ifqueue * as an argument instead of an ifnet *, thereby
>> getting past the implicit behaviour that interfaces only have a
>> single ring.
>>
>> our drivers all have if_start routines that take ifnet pointers
>> though, so there's compatibility for those where a default if_qstart
>> implementation calls if_start for those drivers. in the future
>> if_start will be replaced with if_qstart and we can rename it back
>> to if_start. until then, there's compat.
>>
>> drivers that provide their own if_qstart instead of an if_start
>> function notify the stack by setting IFXF_MPSAFE. a chunk of this
>> diff is changing the IFXF_MPSAFE drivers to set if_qstart instead
>> of if_start. note that this is a mechanical change, it does not add
>> multiple tx queues to these drivers.
>>
>> most of this is straightforward except for the hfsc handling. hfsc
>> needs to track all flows going over an interface, which means all
>> flows have to be serialised through hfsc. the mechanism in use
>> before this change was to swap the priq backend on if_snd with the
>> hfsc backend. the trick with this diff is that we still do that,
>> ie, we only change the first ifqueue on an interface over to hfsc.
>> this works because we use the ifqops on the first ifq to map packets
>> to any of them. because the hfsc map function unconditionally maps
>> packets to the first ifq, all packets end up going through the one
>> hfsc structure we set up. the rest of the ifqs remain set up as
>> priq, but dont get used for sending packets after hfsc has been
>> enabled. if we ever add another ifqops backend, this will have to
>> be rethought. until then this is an elegant hack.
>>
>> a consequence of this change is that the ifnet if_start function
>> should not be called anymore. this isnt true at the moment because
>> of things like net80211 and ppp. they both queue management packets
>> onto a separate queue, but those separate queues are dequeued and
>> processed in the interfaces start routine. if we want to mark wifi
>> and ppp drivers as mpsafe (or get rid of separate if_start and
>> if_qstart routines) this will have to change.
>>
>> the guts of this change are in if_enqueue and if_attach_queues.
>>
>> ok?
>>
>
> Index: arch/octeon/dev/if_cnmac.c
> ===
> RCS file: /cvs/src/sys/arch/octeon/dev/if_cnmac.c,v
> retrieving revision 1.61
> diff -u -p -r1.61 if_cnmac.c
> --- arch/octeon/dev/if_cnmac.c5 Nov 2016 05:14:18 -   1.61
> +++ arch/octeon/dev/if_cnmac.c23 Jan 2017 06:32:59 -
> @@ -138,7 +138,7 @@ int   octeon_eth_ioctl(struct ifnet *, u_l
>  void octeon_eth_watchdog(struct ifnet *);
>  int  octeon_eth_init(struct ifnet *);
>  int  octeon_eth_stop(struct ifnet *, int);
> -void octeon_eth_start(struct ifnet *);
> +void octeon_eth_start(struct ifqueue *);
>
>  int  octeon_eth_send_cmd(struct octeon_eth_softc *, uint64_t, uint64_t);
>  uint64_t octeon_eth_send_makecmd_w1(int, paddr_t);
> @@ -303,7 +303,7 @@ octeon_eth_attach(struct device *parent,
>   ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST;
>   ifp->if_xflags = IFXF_MPSAFE;
>   ifp->if_ioctl = octeon_eth_ioctl;
> - ifp->if_start = octeon_eth_start;
> + ifp->if_qstart = octeon_eth_start;
>   ifp->if_watchdog = octeon_eth_watchdog;
>   ifp->if_hardmtu = OCTEON_ETH_MAX_MTU;
>   IFQ_SET_MAXLEN(&ifp->if_snd, max(GATHER_QUEUE_SIZE, IFQ_MAXLEN));
> @@ -704,8 +704,6 @@ octeon_eth_ioctl(struct ifnet *ifp, u_lo
>   error = 0;
>   }
>
> - if_start(ifp);
> -
>   splx(s);
>   return (error);
>  }
> @@ -923,13 +921,14 @@ done:
>  }
>
>  void
> -octeon_eth_start(struct ifnet *ifp)
> +octeon_eth_start(struct 

Re: Scheduler ping-pong with preempt()

2017-01-27 Thread Simon Mages
Hi,

i did my usual tests.

current:
req/s: 3898.20
variance: 0.84

current+diff:
req/s: 3928.80
variance: 0.45

With this diff the measurements have been much more stable. The
variance of the req/s measurements is now a lot smaller. Also the
performance has increased.

For the bandwidth/s case this diff did not change much. The variance
for those measurements was slightly decreased though.

Overall, nice work, this diff works for me :)

2017-01-24 4:35 GMT+01:00, Martin Pieuchot :
> Userland threads are preempt()'d when hogging a CPU or when processing
> an AST.  Currently when such a thread is preempted the scheduler looks
> for an idle CPU and puts it on its run queue.  That means an
> involuntary context switch often results in a migration.
>
> This is not a problem per se and one could argue that if another CPU
> is idle it makes sense to move.  However with the KERNEL_LOCK() moving
> to another CPU won't necessarily allow the preempt()'d thread to run.
> It's even worse, it increases contention.
>
> If you add to this behavior the fact that sched_choosecpu() prefers idle
> CPUs in a linear order, meaning CPU0 > CPU1 > .. > CPUN, you'll
> understand that the set of idle CPUs will change every time preempt() is
> called.
>
> I believe this behavior affects kernel threads by side effect, since
> the set of idle CPUs changes every time a thread is preempted.  With this
> diff the 'softnet' thread didn't move on a 2-CPU machine during simple
> benchmarks.  Without, it plays ping-pong between CPUs.
>
> The goal of this diff is to reduce the number of migrations.  You
> can compare the value of 'sched_nomigrations' and 'sched_nmigrations'
> with and without it.
>
> As usual, I'd like to know what's the impact of this diff on your
> favorite benchmark.  Please test and report back.
>
> Index: kern/kern_sched.c
> ===
> RCS file: /cvs/src/sys/kern/kern_sched.c,v
> retrieving revision 1.44
> diff -u -p -r1.44 kern_sched.c
> --- kern/kern_sched.c 21 Jan 2017 05:42:03 -  1.44
> +++ kern/kern_sched.c 24 Jan 2017 03:08:23 -
> @@ -51,6 +51,8 @@ uint64_t sched_noidle;  /* Times we didn
>  uint64_t sched_stolen;   /* Times we stole proc from other cpus */
>  uint64_t sched_choose;   /* Times we chose a cpu */
>  uint64_t sched_wasidle;  /* Times we came out of idle */
> +uint64_t sched_nvcsw;/* voluntary context switches */
> +uint64_t sched_nivcsw;   /* involuntary context switches */
>
>  #ifdef MULTIPROCESSOR
>  struct taskq *sbartq;
> Index: kern/kern_synch.c
> ===
> RCS file: /cvs/src/sys/kern/kern_synch.c,v
> retrieving revision 1.136
> diff -u -p -r1.136 kern_synch.c
> --- kern/kern_synch.c 21 Jan 2017 05:42:03 -  1.136
> +++ kern/kern_synch.c 24 Jan 2017 03:08:23 -
> @@ -296,6 +296,7 @@ sleep_finish(struct sleep_state *sls, in
>   if (sls->sls_do_sleep && do_sleep) {
>   p->p_stat = SSLEEP;
>   p->p_ru.ru_nvcsw++;
> + sched_nvcsw++;
>   SCHED_ASSERT_LOCKED();
>   mi_switch();
>   } else if (!do_sleep) {
> @@ -481,6 +482,7 @@ sys_sched_yield(struct proc *p, void *v,
>   p->p_stat = SRUN;
>   setrunqueue(p);
>   p->p_ru.ru_nvcsw++;
> + sched_nvcsw++;
>   mi_switch();
>   SCHED_UNLOCK(s);
>
> Index: kern/sched_bsd.c
> ===
> RCS file: /cvs/src/sys/kern/sched_bsd.c,v
> retrieving revision 1.43
> diff -u -p -r1.43 sched_bsd.c
> --- kern/sched_bsd.c  9 Mar 2016 13:38:50 -   1.43
> +++ kern/sched_bsd.c  24 Jan 2017 03:18:24 -
> @@ -302,6 +302,7 @@ yield(void)
>   p->p_stat = SRUN;
>   setrunqueue(p);
>   p->p_ru.ru_nvcsw++;
> + sched_nvcsw++;
>   mi_switch();
>   SCHED_UNLOCK(s);
>  }
> @@ -327,9 +328,12 @@ preempt(struct proc *newp)
>   SCHED_LOCK(s);
>   p->p_priority = p->p_usrpri;
>   p->p_stat = SRUN;
> +#if 0
>   p->p_cpu = sched_choosecpu(p);
> +#endif
>   setrunqueue(p);
>   p->p_ru.ru_nivcsw++;
> + sched_nivcsw++;
>   mi_switch();
>   SCHED_UNLOCK(s);
>  }
> Index: sys/sched.h
> ===
> RCS file: /cvs/src/sys/sys/sched.h,v
> retrieving revision 1.41
> diff -u -p -r1.41 sched.h
> --- sys/sched.h   17 Mar 2016 13:18:47 -  1.41
> +++ sys/sched.h   24 Jan 2017 02:10:41 -
> @@ -134,6 +134,9 @@ struct schedstate_percpu {
>  extern int schedhz;  /* ideally: 16 */
>  extern int rrticks_init; /* ticks per roundrobin() */
>
> +extern uint64_t sched_nvcsw; /* voluntary context switches */
> +extern uint64_t sched_nivcsw;/* involuntary context switches */
> +
>  struct proc;
>  void schedclock(struct proc 

retire ip6protosw

2017-01-27 Thread Alexander Bluhm
Hi,

If I change the IPv4 pr_input function to the way IPv6 is implemented,
I can get rid of struct ip6protosw and some wrapper functions.  I
think it is more consistent to have fewer different structures.

Most conversions are mechanical.  Where the IPv4 and IPv6 functions
were identical, I removed one of them.
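
To illustrate the conversion, here is a minimal sketch of the handler
shape that the IPv4 input functions take once they follow the IPv6
convention.  It is not code from the tree: the struct mbuf stand-in,
the sketch_* names and the IPPROTO_DONE value are assumptions made so
that the fragment compiles by itself.  It is written in C style but
builds as C++.

```cpp
#include <cassert>

// Hypothetical stand-in for the kernel's struct mbuf; only a length is modelled.
struct mbuf {
	int m_len;
};

// Assumed value, used here for illustration only.
const int IPPROTO_DONE = 257;

// The single handler shape shared by IPv4 and IPv6 after the conversion:
// mbuf and offset are passed by pointer, and the return value tells the
// caller whether processing is finished.
typedef int (*pr_input_fn)(struct mbuf **, int *, int);

int sketch_drops;	// hypothetical drop counter

// Example handler in the new style.  A converted IPv4 function replaces
// each bare "return;" with "return IPPROTO_DONE;".
int
sketch_input(struct mbuf **mp, int *offp, int proto)
{
	struct mbuf *m = *mp;

	(void)proto;
	if (m->m_len < *offp) {
		sketch_drops++;		// packet too short: count it and stop
		return IPPROTO_DONE;
	}
	m->m_len -= *offp;		// strip the outer header, in the spirit of m_adj()
	return IPPROTO_DONE;
}

// One table type can now hold the handlers of both address families.
const pr_input_fn sketch_handlers[] = { sketch_input };

// Small self-check showing how a caller would dispatch through the table.
void
sketch_demo(void)
{
	struct mbuf m = { 40 };
	struct mbuf *mp = &m;
	int off = 20;

	assert(sketch_handlers[0](&mp, &off, 0) == IPPROTO_DONE);
	assert(m.m_len == 20);
}
```

With one function pointer type for both families, a second protosw
structure that differs only in the pr_input signature is no longer
needed.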

The divert_input functions cannot be called anyway so I removed
them.

ok?

bluhm

Index: net/if_etherip.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_etherip.c,v
retrieving revision 1.13
diff -u -p -r1.13 if_etherip.c
--- net/if_etherip.c25 Jan 2017 17:34:31 -  1.13
+++ net/if_etherip.c26 Jan 2017 14:49:14 -
@@ -404,9 +404,10 @@ ip_etherip_output(struct ifnet *ifp, str
return ip_output(m, NULL, NULL, IP_RAWOUTPUT, NULL, NULL, 0);
 }
 
-void
-ip_etherip_input(struct mbuf *m, int off, int proto)
+int
+ip_etherip_input(struct mbuf **mp, int *offp, int proto)
 {
+   struct mbuf *m = *mp;
struct mbuf_list ml = MBUF_LIST_INITIALIZER();
struct etherip_softc *sc;
const struct ip *ip;
@@ -419,13 +420,13 @@ ip_etherip_input(struct mbuf *m, int off
if (ip->ip_p != IPPROTO_ETHERIP) {
m_freem(m);
ipstat_inc(ips_noproto);
-   return;
+   return IPPROTO_DONE;
}
 
if (!etherip_allow) {
m_freem(m);
etheripstat.etherip_pdrops++;
-   return;
+   return IPPROTO_DONE;
}
 
LIST_FOREACH(sc, &etherip_softc_list, sc_entry) {
@@ -452,26 +453,26 @@ ip_etherip_input(struct mbuf *m, int off
 * This is tricky but the path will be removed soon when
 * implementation of etherip is removed from gif(4).
 */
-   etherip_input(m, off, proto);
+   return etherip_input(mp, offp, proto);
 #else
etheripstat.etherip_noifdrops++;
m_freem(m);
+   return IPPROTO_DONE;
 #endif /* NGIF */
-   return;
}
 
-   m_adj(m, off);
+   m_adj(m, *offp);
m = m_pullup(m, sizeof(struct etherip_header));
if (m == NULL) {
etheripstat.etherip_adrops++;
-   return;
+   return IPPROTO_DONE;
}
 
eip = mtod(m, struct etherip_header *);
if (eip->eip_ver != ETHERIP_VERSION || eip->eip_pad) {
etheripstat.etherip_adrops++;
m_freem(m);
-   return;
+   return IPPROTO_DONE;
}
 
etheripstat.etherip_ipackets++;
@@ -482,7 +483,7 @@ ip_etherip_input(struct mbuf *m, int off
m = m_pullup(m, sizeof(struct ether_header));
if (m == NULL) {
etheripstat.etherip_adrops++;
-   return;
+   return IPPROTO_DONE;
}
m->m_flags &= ~(M_BCAST|M_MCAST);
 
@@ -492,6 +493,7 @@ ip_etherip_input(struct mbuf *m, int off
 
ml_enqueue(&ml, m);
if_input(ifp, &ml);
+   return IPPROTO_DONE;
 }
 
 #ifdef INET6
@@ -569,7 +571,6 @@ ip6_etherip_input(struct mbuf **mp, int 
 {
struct mbuf *m = *mp;
struct mbuf_list ml = MBUF_LIST_INITIALIZER();
-   int off = *offp;
struct etherip_softc *sc;
const struct ip6_hdr *ip6;
struct etherip_header *eip;
@@ -612,7 +613,7 @@ ip6_etherip_input(struct mbuf **mp, int 
 * This is tricky but the path will be removed soon when
 * implementation of etherip is removed from gif(4).
 */
-   return etherip_input6(mp, offp, proto);
+   return etherip_input(mp, offp, proto);
 #else
etheripstat.etherip_noifdrops++;
m_freem(m);
@@ -620,7 +621,7 @@ ip6_etherip_input(struct mbuf **mp, int 
 #endif /* NGIF */
}
 
-   m_adj(m, off);
+   m_adj(m, *offp);
m = m_pullup(m, sizeof(struct etherip_header));
if (m == NULL) {
etheripstat.etherip_adrops++;
@@ -652,10 +653,8 @@ ip6_etherip_input(struct mbuf **mp, int 
 
ml_enqueue(&ml, m);
if_input(ifp, &ml);
-
return IPPROTO_DONE;
 }
-
 #endif /* INET6 */
 
 int
Index: net/if_etherip.h
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/if_etherip.h,v
retrieving revision 1.3
diff -u -p -r1.3 if_etherip.h
--- net/if_etherip.h25 Jan 2017 17:34:31 -  1.3
+++ net/if_etherip.h26 Jan 2017 14:48:34 -
@@ -73,7 +73,7 @@ struct etherip_header {
 
 int ip_etherip_sysctl(int *, uint, void *, size_t *, void *, size_t);
 int ip_etherip_output(struct ifnet *, struct mbuf *);
-void ip_etherip_input(struct mbuf *, int, int);
+int ip_etherip_input(struct mbuf **, int *, int);
 
 #ifdef INET6
 int ip6_etherip_output(struct ifnet *, struct mbuf *);
Index: net/if_gif.c

Re: lld: fix library search

2017-01-27 Thread Patrick Wildt
On Fri, Jan 27, 2017 at 10:05:38PM +1000, Patrick Wildt wrote:
> Hi,
> 
> Apparently, converting a Twine to a string (using str() on the Twine)
> and assigning the result to a StringRef variable creates a broken string.  This
> means that the code never runs and thus never finds libc.so.x.y.  It
> falls back to picking up the .a files instead.
> 
> Apparently you have to provide a scratch variable and use that one to
> create a StringRef.  This fixes the issue that the binaries are not
> linked to any shared object.
> 
> Now my lld-linked lld doesn't segfault directly but throws this.  I do
> consider this progress:
> 
> # ./ld.lld
> ld.lld:/usr/lib/libc++.so.0.0: undefined symbol '_ZTISt9bad_alloc'
> ld.lld:/usr/lib/libc++.so.0.0: undefined symbol '_ZNSt9bad_allocD1Ev'
> ld.lld:/usr/lib/libc++.so.0.0: undefined symbol '_ZNSt9bad_allocC1Ev'
> ./ld.lld: error: no input files
> ./ld.lld: error: target emulation unknown: -m or at least one .o file required

I think that issue might be because libc++ uses something from libc++abi
but does not explicitly link to it.  And we provide libc++abi as an ".a"
only.

> 
> ok?
> 
> Patrick
> 
> diff --git a/gnu/llvm/tools/lld/ELF/DriverUtils.cpp b/gnu/llvm/tools/lld/ELF/DriverUtils.cpp
> index 2c0b4405ba2..803b112120d 100644
> --- a/gnu/llvm/tools/lld/ELF/DriverUtils.cpp
> +++ b/gnu/llvm/tools/lld/ELF/DriverUtils.cpp
> @@ -184,7 +184,8 @@ Optional elf::searchLibrary(StringRef Name) {
>if (Optional S = findFile(Dir, "lib" + Name + ".so"))
>  return S;
>  
> -  const StringRef LibName = (Twine("lib") + Name + ".so.").str();
> +  llvm::SmallString<128> Scratch;
> +  const StringRef LibName = ("lib" + Name + ".so.").toStringRef(Scratch);
>int MaxMaj = -1, MaxMin = -1;
>std::error_code EC;
>for (fs::directory_iterator LI(Dir, EC), LE;



lld: fix library search

2017-01-27 Thread Patrick Wildt
Hi,

Apparently, converting a Twine to a string (using str() on the Twine)
and assigning the result to a StringRef variable creates a broken string.  This
means that the code never runs and thus never finds libc.so.x.y.  It
falls back to picking up the .a files instead.

Apparently you have to provide a scratch variable and use that one to
create a StringRef.  This fixes the issue that the binaries are not
linked to any shared object.
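
The likely reason is a lifetime problem: str() returns a temporary
std::string, and a StringRef is only a non-owning view, so it is left
pointing at freed storage once the statement ends.  The sketch below
shows the same pattern without LLVM, using std::string_view as a
stand-in for StringRef and a std::string as the scratch buffer.  The
function names and the version number in the self-check are made up
for illustration.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Build "lib<name>.so." into caller-owned storage and return a view of it.
// The view stays valid for as long as `scratch` does, which is the idea
// behind passing a scratch buffer to Twine::toStringRef() in the patch.
std::string_view lib_prefix(std::string_view name, std::string &scratch)
{
	scratch.assign("lib");
	scratch.append(name);
	scratch.append(".so.");
	return scratch;
}

// Broken pattern, shown only as a comment because using it is undefined
// behaviour:
//
//	std::string_view v = std::string("lib") + "c" + ".so.";
//
// The temporary std::string is destroyed at the end of that statement,
// so `v` dangles and any later comparison against it is unreliable.

// True if `file` looks like "lib<name>.so.<something>".
bool matches_versioned_lib(std::string_view file, std::string_view name)
{
	std::string scratch;	// outlives the view taken below
	std::string_view prefix = lib_prefix(name, scratch);
	return file.size() > prefix.size() &&
	    file.substr(0, prefix.size()) == prefix;
}

// Small self-check showing the intended use.
void lib_prefix_demo()
{
	assert(matches_versioned_lib("libc.so.89.3", "c"));
	assert(!matches_versioned_lib("libc.a", "c"));
}
```

The point is only that the storage must outlive the view; whether it
is a SmallString or something else is a detail.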

Now my lld-linked lld doesn't segfault directly but throws this.  I do
consider this progress:

# ./ld.lld
ld.lld:/usr/lib/libc++.so.0.0: undefined symbol '_ZTISt9bad_alloc'
ld.lld:/usr/lib/libc++.so.0.0: undefined symbol '_ZNSt9bad_allocD1Ev'
ld.lld:/usr/lib/libc++.so.0.0: undefined symbol '_ZNSt9bad_allocC1Ev'
./ld.lld: error: no input files
./ld.lld: error: target emulation unknown: -m or at least one .o file required

ok?

Patrick

diff --git a/gnu/llvm/tools/lld/ELF/DriverUtils.cpp b/gnu/llvm/tools/lld/ELF/DriverUtils.cpp
index 2c0b4405ba2..803b112120d 100644
--- a/gnu/llvm/tools/lld/ELF/DriverUtils.cpp
+++ b/gnu/llvm/tools/lld/ELF/DriverUtils.cpp
@@ -184,7 +184,8 @@ Optional elf::searchLibrary(StringRef Name) {
   if (Optional S = findFile(Dir, "lib" + Name + ".so"))
 return S;
 
-  const StringRef LibName = (Twine("lib") + Name + ".so.").str();
+  llvm::SmallString<128> Scratch;
+  const StringRef LibName = ("lib" + Name + ".so.").toStringRef(Scratch);
   int MaxMaj = -1, MaxMin = -1;
   std::error_code EC;
   for (fs::directory_iterator LI(Dir, EC), LE;