Re: arm64 MP

2018-02-23 Thread Otto Moerbeek
On Tue, Feb 20, 2018 at 08:50:14PM +0100, Otto Moerbeek wrote:

> On Tue, Feb 20, 2018 at 07:52:49PM +0100, Otto Moerbeek wrote:
> 
> > On Tue, Feb 20, 2018 at 08:58:47AM +0100, Otto Moerbeek wrote:
> > 
> > > On Tue, Feb 20, 2018 at 08:52:20AM +0100, Mark Kettenis wrote:
> > > 
> > > > > Date: Mon, 19 Feb 2018 13:49:48 +0100 (CET)
> > > > > From: Mark Kettenis 
> > > > > 
> > > > > The diff below attempts to make the arm64 pmap "mpsafe" and enables MP
> > > > > support.  This diff survived a full build on my Firefly-RK3399 board.
> > > > > It also seems to work on the Overdrive 1000.  It might even work on
> > > > > the Raspberry Pi 3.  I'd appreciate it if people could play with this
> > > > > on the Raspberry Pi and other arm64 hardware they have.
> > > > > 
> > > > > I tried to follow a strategy for making the pmap "mpsafe" that more
> > > > > closely resembles what we did for our other architectures, although I
> > > > > looked at Dale Rahn's diff as well.  It seems I fixed at least some
> > > > > bugs in the process.  But I'm sure there are bugs remaining.  I may
> > > > > even have introduced new ones.  So do expect this to crash and burn,
> > > > > or at least lock up.
> > > > > 
> > > > > I'm also interested in people testing the normal GENERIC kernel with
> > > > > this diff, especially building ports.  There are some changes in there
> > > > > that could affect non-MP as well.
> > > > 
> > > > Here is a slightly better diff that removes a potential lock ordering
> > > > problem, removes a redundant splvm and moves some code around to avoid
> > > > a level of indentation.  Gaining confidence, so I'm looking for ok's
> > > > now.
> > > 
> > > I have been running you diff on my rpi (together with jsg firmware 
> > > update).
> > > 
> > > My rpi has hung twice now while doing a make -j4 in the clang part of
> > > the tree.
> > > 
> > > The last top screen shows it using swap (around 400M) and has three
> > > c++ processes each with resident size around 200M.
> > > 
> > > I'll try this diff now.
> > 
> > Not much change with this diff compared to the first. The system
> > functions more or less, but I'm seeing:
> > 
> > - No processes being scheduled on CPU 0
> > - A hang on reboot
> > - slaacd is not table to configure the ipv6 correctly, the address
> > remains "tentative" and packets do not flow.
> > 
> > Switching to single user mode, init complains it cannot kill all
> > processes. It turns out the main slaacd process hangs and cannot be
> > killed (not even with kill -9).
> > 
> > All this does not happen running a non-MP kernel.
> > 
> > -Otto
> 
> A bit more info:
> 
> When displaying system processes with top, this is what I see:
> 
> 79636 root -2200K  121M onproc/0  - 0:05 99.71% idle0
>   519 otto  280 1040K 1848K onproc/3  - 0:01  0.05% top
> 58718 root  -500K  121M sleep/3   -   218:42  0.00% idle1
> 58489 root  -500K  121M onproc/1  -   218:37  0.00% idle3
> 41386 root  -500K  121M onproc/2  -   218:28  0.00% idle2
> 
> Compared to and amd64 system, it shows idle0 on cpu0 taking
> cpu time, but not actually accumulating it?
> 
> amd64 system:
> 
> 49951 root -2200K   10M onproc/0  -   406.8H  0.00% idle0
> 51048 root -2200K   10M onproc/3  -   406.4H  0.00% idle3
> 60166 root -2200K   10M idle  -   405.6H  0.00% idle1
>  9554 root -2200K   10M onproc/2  -   405.2H  0.00% idle2
> 
>   -Otto

With the latest commit to arm64:

http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/arch/arm64/dev/bcm2836_intr.c.diff?r1=1.6=1.7

the problems have gone and my rpi works as expected.

-Otto



[PATCH] Folding of Comment Lines in make/lowparse.c

2018-02-23 Thread William Ahern
The routine skip_empty_lines_and_read_char() is an optimization to skip over
blocks of comment lines. When it reads an unescaped '#' it uses the helper
routine skip_to_end_of_line(). But skip_to_end_of_line() doesn't fold lines
as it should and like its parent caller does. (See patch at end of message.)

I found this bug perusing the BearSSL source code.

  25 # The lines below are a horrible hack that nonetheless works. On a
  26 # "make" utility compatible with Single Unix v4 (this includes GNU and
  27 # BSD make), the '\' at the end of a command line counts as an escape
  28 # for the newline character, so the next line is still a comment.
  29 # However, Microsoft's nmake.exe (that comes with Visual Studio) does
  30 # not interpret the final '\' that way in a comment. The end result is
  31 # that when using nmake.exe, this will include "mk/Win.mk", whereas
  32 # GNU/BSD make will include "mk/Unix.mk".
  33 
  34 # \
  35 !ifndef 0 # \
  36 !include mk/NMake.mk # \
  37 !else
  38 .POSIX:
  39 include mk/SingleUnix.mk
  40 # Extra hack for OpenBSD make.
  41 ifndef: all
  42 0: all
  43 endif: all
  44 # \
  45 !endif

  -- https://bearssl.org/gitweb/?p=BearSSL;a=blob;f=Makefile;hb=9dc62112

It's definitely a bug. POSIX says,

  There are three kinds of comments: blank lines, empty lines, and a
   ( '#' ) and all following characters up to the first
  unescaped  character.

and

  When an escaped  (one preceded by a ) is found
  anywhere in the makefile except in a command line, an include line, or a
  line immediately preceding an include line, it shall be replaced, along
  with any leading white space on the following line, with a single .

Fortunately POSIX leaves unspecified what happens to an escaped newline
preceding an include line. My patch should be the end of the
matter unless there's some other bug lurking. Here's a test case,

  # \
  broken:;@echo broken
  fixed:;@echo fixed

and the patch,

Index: lowparse.c
===
RCS file: /cvs/src/usr.bin/make/lowparse.c,v
retrieving revision 1.35
diff -u -r1.35 lowparse.c
--- lowparse.c  21 Oct 2016 16:12:38 -  1.35
+++ lowparse.c  23 Feb 2018 20:02:59 -
@@ -247,20 +247,21 @@
 static int
 skip_to_end_of_line(void)
 {
-   if (current->F) {
-   if (current->end - current->ptr > 1)
-   current->ptr = current->end - 1;
-   if (*current->ptr == '\n')
-   return *current->ptr++;
-   return EOF;
-   } else {
-   int c;
+   int escaped = 0, c;
 
-   do {
-   c = read_char();
-   } while (c != '\n' && c != EOF);
-   return c;
+   while (EOF != (c = read_char())) {
+   if (escaped) {
+   if (c == '\n')
+   current->origin.lineno++;
+   escaped = 0;
+   } else if (c == '\\') {
+   escaped = 1;
+   } else if (c == '\n') {
+   break;
+   }
}
+
+   return c;
 }
 
 



Re: armv7 really isn't a strict-alignment architecture

2018-02-23 Thread Brandon Bergren
Apparently, the control bit only controls whether the cpu will automatically do 
fixups for word or smaller single load/store. Targeting an unaligned address 
with a *multiple* load/store instruction will still fault. Over in Linux land, 
there's this giant grody handler that has emulation for the faultable 
instructions ( 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/mm/alignment.c
 ) so they can pretend everything is normal and fine...

Somewhat related random blog post: 
https://jsolano.net/2012/09/06/arm-unaligned-data-access-and-floating-point-in-linux/

As per A3.2.1 in the architecture reference manual, the only thing the bit 
controls is whether the specific instructions that have hardware fixups (which 
isn't even half of the ones that can fault!) will do such fixups in the 
background. It's not a "magic bit that makes the cpu act as if it's fine with 
unaligned access", it just means you can get away with a slightly less complex 
fault handler. And the current fault handler boils down to shooting the 
offending process in the head.

Which seems to me like a valid thing to do.

On Fri, Feb 23, 2018, at 6:49 AM, Mark Kettenis wrote:
> > Date: Fri, 23 Feb 2018 12:11:23 +0100
> > From: "Boudewijn Dijkstra" 
> > 
> > Op Thu, 22 Feb 2018 22:51:12 +0100 schreef Mark Kettenis  
> > :
> > > I hate to loose yet another strict-alignment canary, but the reality
> > > is that the rest of the world assumes that armv7 supports unaligned
> > > access which means that compilers generate code that assumes this
> > > works when compiling code for armv7 and later (e.g. NEON code) and
> > > that hand-written assembler code assumes this works as well.
> > 
> > Both gcc and clang have -munaligned-access in effect by default on ARMv7.
> 
> That depends on the OS.  For OpenBSD -mno-unaligned-access is the
> default, at least for clang.  And our base gcc barely supported armv7
> and we configured it for some armv6 variant.
> 
> > So if CPU_CONTROL_AFLT_ENABLE is turned on, then all code should be  
> > compiled with -mno-unaligned-access.  I cannot find any use of this option  
> > in the source tree.  Am I missing something?
> 
> See above.
> 
> But even with -mno-unaligned-access clang generates code that doesn't
> work with the SCTLR.A set to 1.  The particular case I encountered was
> a floating-point or vector instruction that needs 8-byte alignment
> where the operand was only 4-byte aligned.  That's a bug in clang/llvm
> of course, but since both Linux and Darwin set SCTRL.A to 0 such bugs
> are simply not caught.
> 
> Cheers,
> 
> Mark
> 


-- 
  Brandon Bergren
  Technical Generalist



Re: armv7 really isn't a strict-alignment architecture

2018-02-23 Thread Mark Kettenis
> Date: Fri, 23 Feb 2018 12:11:23 +0100
> From: "Boudewijn Dijkstra" 
> 
> Op Thu, 22 Feb 2018 22:51:12 +0100 schreef Mark Kettenis  
> :
> > I hate to loose yet another strict-alignment canary, but the reality
> > is that the rest of the world assumes that armv7 supports unaligned
> > access which means that compilers generate code that assumes this
> > works when compiling code for armv7 and later (e.g. NEON code) and
> > that hand-written assembler code assumes this works as well.
> 
> Both gcc and clang have -munaligned-access in effect by default on ARMv7.

That depends on the OS.  For OpenBSD -mno-unaligned-access is the
default, at least for clang.  And our base gcc barely supported armv7
and we configured it for some armv6 variant.

> So if CPU_CONTROL_AFLT_ENABLE is turned on, then all code should be  
> compiled with -mno-unaligned-access.  I cannot find any use of this option  
> in the source tree.  Am I missing something?

See above.

But even with -mno-unaligned-access clang generates code that doesn't
work with the SCTLR.A set to 1.  The particular case I encountered was
a floating-point or vector instruction that needs 8-byte alignment
where the operand was only 4-byte aligned.  That's a bug in clang/llvm
of course, but since both Linux and Darwin set SCTRL.A to 0 such bugs
are simply not caught.

Cheers,

Mark



Re: armv7 really isn't a strict-alignment architecture

2018-02-23 Thread Boudewijn Dijkstra
Op Thu, 22 Feb 2018 22:51:12 +0100 schreef Mark Kettenis  
:

I hate to loose yet another strict-alignment canary, but the reality
is that the rest of the world assumes that armv7 supports unaligned
access which means that compilers generate code that assumes this
works when compiling code for armv7 and later (e.g. NEON code) and
that hand-written assembler code assumes this works as well.


Both gcc and clang have -munaligned-access in effect by default on ARMv7.

So if CPU_CONTROL_AFLT_ENABLE is turned on, then all code should be  
compiled with -mno-unaligned-access.  I cannot find any use of this option  
in the source tree.  Am I missing something?



I hit
yet another case of this while playing around with softfp.

The diff below stops the kernel from generating alignment faults as a
first step.

ok?


Index: arch/arm/arm/cpufunc.c
===
RCS file: /cvs/src/sys/arch/arm/arm/cpufunc.c,v
retrieving revision 1.52
diff -u -p -r1.52 cpufunc.c
--- arch/arm/arm/cpufunc.c  8 Sep 2017 05:36:51 -   1.52
+++ arch/arm/arm/cpufunc.c  22 Feb 2018 21:42:35 -
@@ -395,7 +395,6 @@ armv7_setup()
| CPU_CONTROL_AFE;
cpuctrl = CPU_CONTROL_MMU_ENABLE
-   | CPU_CONTROL_AFLT_ENABLE
| CPU_CONTROL_DC_ENABLE
| CPU_CONTROL_BPRD_ENABLE
| CPU_CONTROL_IC_ENABLE




--
Gemaakt met Opera's e-mailprogramma: http://www.opera.com/mail/



Re: pf generic packet delay

2018-02-23 Thread Martin Pieuchot
On 23/02/18(Fri) 04:08, Henning Brauer wrote:
> * Martin Pieuchot  [2018-02-21 09:37]:
> > On 21/02/18(Wed) 02:37, Henning Brauer wrote:
> > I'd suggest moving the pool allocation and the function in net/pf*.c
> > and only have a function call under #if NPF > 0.
> 
> worth discussing, but imo that part doesn't really have all that much
> to do with pf.

It is only used by pf(4).  All the code is under #if NPF > 0, so it *is*
related to PF.  We have been reducing the #ifdef maze through the years
for maintainability reasons.  I don't see the point for reintroducing
them for taste.

> > I'd suggest defining your own structure containing a timeout and a mbuf
> > pointer instead of abusing ph_cookie.  Since you're already allocating
> > something it doesn't matter much and you're code won't be broken by
> > a future refactoring.
> 
> dlg pointed me to ph_cookie, I was about to use my own structure.
> "ph_cookie is for exactly that"

ph_cookie is not for that.  ph_cookie is a left-over because the wifi
stack abused `rcvif' to attach a node to management frames.  The problem
about using ph_cookie all over the place is that you don't know who owns
it.  Is it the driver like in wireless case?  Is it pf(4) like in your
case?  Is it a subsystem like in pipex case?

If you can easily avoid using ph_cookie, then please do it.  Otherwise
you're putting maintenance burden by writing fragile code that will
subtly break when somebody will start using ph_cookie for something else.