Re: pg_jobc going negative?

2020-06-08 Thread Mouse
> pg_jobc is a process group struct member counting the number of
> processes with [...]

> [W]e caught issues that pg_jobc can go negative.  [...]

> Is going negative in the first place a real bug?

I would say, yes.  A "count of processes with [condition]" cannot ever
truly be negative.

The only question is whether the bug is in code or in the idea that the
variable always accurately reflects that (conceptual) count.  (Okay,
there's also the question of whether it matters in practice, but in my
opinion that does not affect the question of whether it's a bug, only
whether the bug matters in practice.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


pg_jobc going negative?

2020-06-08 Thread Kamil Rytarowski
pg_jobc is a process group struct member counting the number of
processes with a parent capable of doing job control. Once reaching 0,
the process group is orphaned.

With the addition of asserts checking for pg_jobc > 0 (by maxv@), we
caught issues that pg_jobc can go negative. I have landed new ATF tests
that trigger this reliably with ptrace(2)
(src/tests/lib/libc/sys/t_ptrace_fork_wait.h r.1.7). The problem was
originally triggered with GDB.

There are also other sources of these asserts due to races
The ptrace(2) ATF tests are reliable triggering a negative pg_jobc,
however there are racy tests doing the same as triggered by syzkaller
(mentioned at least in [1]).

Should we allow pg_jobc going negative? (Other BSDs allow this.)

Is going negative in the first place a real bug?

Are the races triggered by syzkaller real bugs?

[1] http://mail-index.netbsd.org/current-users/2020/05/04/msg038511.html

On 20.04.2020 18:32, Maxime Villard wrote:
> Module Name:  src
> Committed By: maxv
> Date: Mon Apr 20 16:32:03 UTC 2020
> 
> Modified Files:
>   src/sys/kern: kern_proc.c
> 
> Log Message:
> Add three KASSERTs, to detect refcount bugs.
> 
> This narrows down an unknown bug in some place near, that has manifested
> itself in various forms (use-after-frees, uninit accesses, page faults,
> segmentation faults), all pointed out by syzbot.
> 
> The first KASSERT in fixjobc() fires when the bug is encountered.
> 
> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.244 -r1.245 src/sys/kern/kern_proc.c
> 
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.
> 
> 
> Modified files:
> 
> Index: src/sys/kern/kern_proc.c
> diff -u src/sys/kern/kern_proc.c:1.244 src/sys/kern/kern_proc.c:1.245
> --- src/sys/kern/kern_proc.c:1.244Sun Apr 19 20:31:59 2020
> +++ src/sys/kern/kern_proc.c  Mon Apr 20 16:32:03 2020
> @@ -1,4 +1,4 @@
> -/*   $NetBSD: kern_proc.c,v 1.244 2020/04/19 20:31:59 thorpej Exp $  */
> +/*   $NetBSD: kern_proc.c,v 1.245 2020/04/20 16:32:03 maxv Exp $ */
>  
>  /*-
>   * Copyright (c) 1999, 2006, 2007, 2008, 2020 The NetBSD Foundation, Inc.
> @@ -62,7 +62,7 @@
>   */
>  
>  #include 
> -__KERNEL_RCSID(0, "$NetBSD: kern_proc.c,v 1.244 2020/04/19 20:31:59 thorpej 
> Exp $");
> +__KERNEL_RCSID(0, "$NetBSD: kern_proc.c,v 1.245 2020/04/20 16:32:03 maxv Exp 
> $");
>  
>  #ifdef _KERNEL_OPT
>  #include "opt_kstack.h"
> @@ -554,6 +554,7 @@ proc_sessrele(struct session *ss)
>  {
>  
>   KASSERT(mutex_owned(proc_lock));
> + KASSERT(ss->s_count > 0);
>   /*
>* We keep the pgrp with the same id as the session in order to
>* stop a process being given the same pid.  Since the pgrp holds
> @@ -1181,8 +1182,11 @@ fixjobc(struct proc *p, struct pgrp *pgr
>   if (entering) {
>   pgrp->pg_jobc++;
>   p->p_lflag &= ~PL_ORPHANPG;
> - } else if (--pgrp->pg_jobc == 0)
> - orphanpg(pgrp);
> + } else {
> + KASSERT(pgrp->pg_jobc > 0);
> + if (--pgrp->pg_jobc == 0)
> + orphanpg(pgrp);
> + }
>   }
>  
>   /*
> @@ -1197,8 +1201,11 @@ fixjobc(struct proc *p, struct pgrp *pgr
>   if (entering) {
>   child->p_lflag &= ~PL_ORPHANPG;
>   hispgrp->pg_jobc++;
> - } else if (--hispgrp->pg_jobc == 0)
> - orphanpg(hispgrp);
> + } else {
> + KASSERT(hispgrp->pg_jobc > 0);
> + if (--hispgrp->pg_jobc == 0)
> + orphanpg(hispgrp);
> + }
>   }
>   }
>  }
> 




signature.asc
Description: OpenPGP digital signature


Re: pci_mapreg_map and BAR from `Bus space tutorial'

2020-06-08 Thread Mouse
>>> - FAA_MMREG_BAR may contain the beginning address of a space to be
>>>mapped. But how does `pci_mapreg_map' know the extension of such
>>>a map?
>> [...], the BAR contains all info needed (or should, if setup has
>> worked correctly).
> Oh, ok, so the BAR does not specify just an address: it also
> specifies the extension of the memory space.  I could not guess this.

Guessing should not be relevant; how BARs work is part of the PCI
definition.  I forget details, but I think things like "I/O mapping
versus memory mapping" are present in read-only bits at the LSB end of
the BAR.  NetBSD has a bunch of #defines for them; look for PCI_MAPREG_
in pcireg.h - or, at least, that's where they are on 1.4T, 5.2, and
8.0; while I don't have anything more recent at ready hand to check,
I'd guess that something that stable is probably going to stay stable.

>> Give this is mips I assume PCI_MAPREG_TYPE_MEM is correct, but for a
>> new device you may be unsure, and sometimes variants of devices
>> exist that either make the type IO or MEM.  You can query that in
>> your driver with pci_mapreg_info() - but for the concrete case the
>> userland pcictl output is easier.
> Consider that this is a custom PCI device, created by the author of
> the tutorial as statement in the sourcecode of gxemul [...]

In this particular case, sure; I suspect the person who wrote the
double-quoted text above was trying to give advice applicable beyond
the case at immediate hand.

> Looking into the output of pcictl(8), also `I/O space accesses: on'
> is on, but I don't know if it is significant.

PCI is defined to have two address spaces, called I/O space and memory
space.  On x86, these normally correspond to I/O instructions
(inb/outb/etc) and memory-mapped I/O, respectively.  On other machines,
this often works differently; some PCI attachments present PCI I/O
space as memory-mapped too, just in a different part of the address
space.  (I daresay this could be done on x86 too, if a chipset/board
maker wanted to, but it would render the hardware incompatible with a
lot of OS code, so it's not normally done.)

The line you cite

> Command register: 0x0003
>   I/O space accesses: on

is a PCI thing, indicating that the device is configured to respond to
a PCI access marked as an I/O space access, provided it's mapped by a
suitable mapping register.  If the hardware is something that doesn't
have separate I/O space instructions, then PCI I/O space probably turns
into something else from the CPU's point of view, like a different part
of the address space, or some other way of marking accesses as being in
a different address space (like SPARC's lda and sta instructions, which
take an address-space identifier as well as an address).

For a machine that already has a NetBSD port supporting PCI, those
details should already be handled for you by the bus_space
implementation for that port.  Your driver just needs to know "is this
a (PCI) I/O space mapping or a (PCI) memory space mapping?", and
sometimes not even that; what it turns into at the machine-code level
is hidden by the bus_space layer.  See pci_mapreg_type(),
pci_mapreg_map(), etc.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: pci_mapreg_map and BAR from `Bus space tutorial'

2020-06-08 Thread Rocky Hotas
Hello Martin,
sorry for the delay and thanks a lot for your answer.

On giu 02  7:03, Martin Husemann wrote:
 
> > - FAA_MMREG_BAR may contain the beginning address of a space to be
> >   mapped. But how does `pci_mapreg_map' know the extension of such a map?
> 
> It is just an offset (like a register number), the BAR contains all info
> needed (or should, if setup has worked correctly).

Oh, ok, so the BAR does not specify just an address: it also specifies
the extension of the memory space. I could not guess this.

> If you remove your driver you can list the details from userland with
> pcictl(8).
[...]
> Usually firmware (or early bootloaders) configure the BARs properly, but
> sometimes they don't, and then NetBSD needs to run fixup code (which is
> a kernel option and not included in most kernels). The output
> of pcictl would help here.

Ok! With or without the driver, the output of pcictl(8) is the same and
I attach it to this message. The device we are talking about is in bus
pci0, dev 12, fun 0, so I ran from a root prompt `pcictl pci0 dump -d 12'.

> Give this is mips I assume PCI_MAPREG_TYPE_MEM is correct, but for a new
> device you may be unsure, and sometimes variants of devices exist that
> either make the type IO or MEM. You can query that in your driver with
> pci_mapreg_info() - but for the concrete case the userland pcictl output is
> easier.

Consider that this is a custom PCI device, created by the author of the
tutorial as statement in the sourcecode of gxemul which defines the
`cobalt' machine.
The tutorial used PCI_MAPREG_TYPE_MEM. Looking into the output of pcictl(8),
also `I/O space accesses: on' is on, but I don't know if it is significant.

Bye!

Rocky
PCI configuration registers:
  Common header:
0x00: 0x0001fabc 0x0003 0x0b41 0x

Vendor ID: 0xfabc
Device ID: 0x0001
Command register: 0x0003
  I/O space accesses: on
  Memory space accesses: on
  Bus mastering: off
  Special cycles: off
  MWI transactions: off
  Palette snooping: off
  Parity error checking: off
  Address/data stepping: off
  System error (SERR): off
  Fast back-to-back transactions: off
  Interrupt disable: off
Status register: 0x
  Immediate Readiness: off
  Interrupt status: inactive
  Capability List support: off
  66 MHz capable: off
  User Definable Features (UDF) support: off
  Fast back-to-back capable: off
  Data parity error detected: off
  DEVSEL timing: fast (0x0)
  Slave signaled Target Abort: off
  Master received Target Abort: off
  Master received Master Abort: off
  Asserted System Error (SERR): off
  Parity error detected: off
Class Name: processor (0x0b)
Subclass Name: Co-processor (0x40)
Interface: 0x00
Revision ID: 0x01
BIST: 0x00
Header Type: 0x00 (0x00)
Latency Timer: 0x00
Cache Line Size: 0bytes (0x00)

  Type 0 ("normal" device) header:
0x10: 0x1011 0x 0x 0x
0x20: 0x 0x 0x 0x
0x30: 0x 0x 0x 0x

Base address register at 0x10
  type: 32-bit nonprefetchable memory
  base: 0x1011
Base address register at 0x14
  not implemented
Base address register at 0x18
  not implemented
Base address register at 0x1c
  not implemented
Base address register at 0x20
  not implemented
Base address register at 0x24
  not implemented
Cardbus CIS Pointer: 0x
Subsystem vendor ID: 0x
Subsystem ID: 0x
Expansion ROM Base Address Register: 0x
  base: 0x
  Expansion ROM Enable: off
  Validation Status: Validation not supported
  Validation Details: 0x0
Reserved @ 0x34: 0x
Reserved @ 0x38: 0x
Maximum Latency: 0x00
Minimum Grant: 0x00
Interrupt pin: 0x00 (none)
Interrupt line: 0x00

  Device-dependent header:
0x40: 0x 0x 0x 0x
0x50: 0x 0x 0x 0x
0x60: 0x 0x 0x 0x
0x70: 0x 0x 0x 0x
0x80: 0x 0x 0x 0x
0x90: 0x 0x 0x 0x
0xa0: 0x 0x 0x 0x
0xb0: 0x 0x 0x 0x
0xc0: 0x 0x 0x 0x
0xd0: 0x 0x 0x 0x
0xe0: 0x 0x 0x 0x
0xf0: 0x 0x 0x 0x