Re: pg_jobc going negative?
> pg_jobc is a process group struct member counting the number of > processes with [...] > [W]e caught issues that pg_jobc can go negative. [...] > Is going negative in the first place a real bug? I would say, yes. A "count of processes with [condition]" cannot ever truly be negative. The only question is whether the bug is in code or in the idea that the variable always accurately reflects that (conceptual) count. (Okay, there's also the question of whether it matters in practice, but in my opinion that does not affect the question of whether it's a bug, only whether the bug matters in practice.) /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
pg_jobc going negative?
pg_jobc is a process group struct member counting the number of processes with a parent capable of doing job control. Once reaching 0, the process group is orphaned. With the addition of asserts checking for pg_jobc > 0 (by maxv@), we caught issues that pg_jobc can go negative. I have landed new ATF tests that trigger this reliably with ptrace(2) (src/tests/lib/libc/sys/t_ptrace_fork_wait.h r.1.7). The problem was originally triggered with GDB. There are also other sources of these asserts due to races The ptrace(2) ATF tests are reliable triggering a negative pg_jobc, however there are racy tests doing the same as triggered by syzkaller (mentioned at least in [1]). Should we allow pg_jobc going negative? (Other BSDs allow this.) Is going negative in the first place a real bug? Are the races triggered by syzkaller real bugs? [1] http://mail-index.netbsd.org/current-users/2020/05/04/msg038511.html On 20.04.2020 18:32, Maxime Villard wrote: > Module Name: src > Committed By: maxv > Date: Mon Apr 20 16:32:03 UTC 2020 > > Modified Files: > src/sys/kern: kern_proc.c > > Log Message: > Add three KASSERTs, to detect refcount bugs. > > This narrows down an unknown bug in some place near, that has manifested > itself in various forms (use-after-frees, uninit accesses, page faults, > segmentation faults), all pointed out by syzbot. > > The first KASSERT in fixjobc() fires when the bug is encountered. > > > To generate a diff of this commit: > cvs rdiff -u -r1.244 -r1.245 src/sys/kern/kern_proc.c > > Please note that diffs are not public domain; they are subject to the > copyright notices on the relevant files. > > > Modified files: > > Index: src/sys/kern/kern_proc.c > diff -u src/sys/kern/kern_proc.c:1.244 src/sys/kern/kern_proc.c:1.245 > --- src/sys/kern/kern_proc.c:1.244Sun Apr 19 20:31:59 2020 > +++ src/sys/kern/kern_proc.c Mon Apr 20 16:32:03 2020 > @@ -1,4 +1,4 @@ > -/* $NetBSD: kern_proc.c,v 1.244 2020/04/19 20:31:59 thorpej Exp $ */ > +/* $NetBSD: kern_proc.c,v 1.245 2020/04/20 16:32:03 maxv Exp $ */ > > /*- > * Copyright (c) 1999, 2006, 2007, 2008, 2020 The NetBSD Foundation, Inc. > @@ -62,7 +62,7 @@ > */ > > #include > -__KERNEL_RCSID(0, "$NetBSD: kern_proc.c,v 1.244 2020/04/19 20:31:59 thorpej > Exp $"); > +__KERNEL_RCSID(0, "$NetBSD: kern_proc.c,v 1.245 2020/04/20 16:32:03 maxv Exp > $"); > > #ifdef _KERNEL_OPT > #include "opt_kstack.h" > @@ -554,6 +554,7 @@ proc_sessrele(struct session *ss) > { > > KASSERT(mutex_owned(proc_lock)); > + KASSERT(ss->s_count > 0); > /* >* We keep the pgrp with the same id as the session in order to >* stop a process being given the same pid. Since the pgrp holds > @@ -1181,8 +1182,11 @@ fixjobc(struct proc *p, struct pgrp *pgr > if (entering) { > pgrp->pg_jobc++; > p->p_lflag &= ~PL_ORPHANPG; > - } else if (--pgrp->pg_jobc == 0) > - orphanpg(pgrp); > + } else { > + KASSERT(pgrp->pg_jobc > 0); > + if (--pgrp->pg_jobc == 0) > + orphanpg(pgrp); > + } > } > > /* > @@ -1197,8 +1201,11 @@ fixjobc(struct proc *p, struct pgrp *pgr > if (entering) { > child->p_lflag &= ~PL_ORPHANPG; > hispgrp->pg_jobc++; > - } else if (--hispgrp->pg_jobc == 0) > - orphanpg(hispgrp); > + } else { > + KASSERT(hispgrp->pg_jobc > 0); > + if (--hispgrp->pg_jobc == 0) > + orphanpg(hispgrp); > + } > } > } > } > signature.asc Description: OpenPGP digital signature
Re: pci_mapreg_map and BAR from `Bus space tutorial'
>>> - FAA_MMREG_BAR may contain the beginning address of a space to be >>>mapped. But how does `pci_mapreg_map' know the extension of such >>>a map? >> [...], the BAR contains all info needed (or should, if setup has >> worked correctly). > Oh, ok, so the BAR does not specify just an address: it also > specifies the extension of the memory space. I could not guess this. Guessing should not be relevant; how BARs work is part of the PCI definition. I forget details, but I think things like "I/O mapping versus memory mapping" are present in read-only bits at the LSB end of the BAR. NetBSD has a bunch of #defines for them; look for PCI_MAPREG_ in pcireg.h - or, at least, that's where they are on 1.4T, 5.2, and 8.0; while I don't have anything more recent at ready hand to check, I'd guess that something that stable is probably going to stay stable. >> Give this is mips I assume PCI_MAPREG_TYPE_MEM is correct, but for a >> new device you may be unsure, and sometimes variants of devices >> exist that either make the type IO or MEM. You can query that in >> your driver with pci_mapreg_info() - but for the concrete case the >> userland pcictl output is easier. > Consider that this is a custom PCI device, created by the author of > the tutorial as statement in the sourcecode of gxemul [...] In this particular case, sure; I suspect the person who wrote the double-quoted text above was trying to give advice applicable beyond the case at immediate hand. > Looking into the output of pcictl(8), also `I/O space accesses: on' > is on, but I don't know if it is significant. PCI is defined to have two address spaces, called I/O space and memory space. On x86, these normally correspond to I/O instructions (inb/outb/etc) and memory-mapped I/O, respectively. On other machines, this often works differently; some PCI attachments present PCI I/O space as memory-mapped too, just in a different part of the address space. (I daresay this could be done on x86 too, if a chipset/board maker wanted to, but it would render the hardware incompatible with a lot of OS code, so it's not normally done.) The line you cite > Command register: 0x0003 > I/O space accesses: on is a PCI thing, indicating that the device is configured to respond to a PCI access marked as an I/O space access, provided it's mapped by a suitable mapping register. If the hardware is something that doesn't have separate I/O space instructions, then PCI I/O space probably turns into something else from the CPU's point of view, like a different part of the address space, or some other way of marking accesses as being in a different address space (like SPARC's lda and sta instructions, which take an address-space identifier as well as an address). For a machine that already has a NetBSD port supporting PCI, those details should already be handled for you by the bus_space implementation for that port. Your driver just needs to know "is this a (PCI) I/O space mapping or a (PCI) memory space mapping?", and sometimes not even that; what it turns into at the machine-code level is hidden by the bus_space layer. See pci_mapreg_type(), pci_mapreg_map(), etc. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: pci_mapreg_map and BAR from `Bus space tutorial'
Hello Martin, sorry for the delay and thanks a lot for your answer. On giu 02 7:03, Martin Husemann wrote: > > - FAA_MMREG_BAR may contain the beginning address of a space to be > > mapped. But how does `pci_mapreg_map' know the extension of such a map? > > It is just an offset (like a register number), the BAR contains all info > needed (or should, if setup has worked correctly). Oh, ok, so the BAR does not specify just an address: it also specifies the extension of the memory space. I could not guess this. > If you remove your driver you can list the details from userland with > pcictl(8). [...] > Usually firmware (or early bootloaders) configure the BARs properly, but > sometimes they don't, and then NetBSD needs to run fixup code (which is > a kernel option and not included in most kernels). The output > of pcictl would help here. Ok! With or without the driver, the output of pcictl(8) is the same and I attach it to this message. The device we are talking about is in bus pci0, dev 12, fun 0, so I ran from a root prompt `pcictl pci0 dump -d 12'. > Give this is mips I assume PCI_MAPREG_TYPE_MEM is correct, but for a new > device you may be unsure, and sometimes variants of devices exist that > either make the type IO or MEM. You can query that in your driver with > pci_mapreg_info() - but for the concrete case the userland pcictl output is > easier. Consider that this is a custom PCI device, created by the author of the tutorial as statement in the sourcecode of gxemul which defines the `cobalt' machine. The tutorial used PCI_MAPREG_TYPE_MEM. Looking into the output of pcictl(8), also `I/O space accesses: on' is on, but I don't know if it is significant. Bye! Rocky PCI configuration registers: Common header: 0x00: 0x0001fabc 0x0003 0x0b41 0x Vendor ID: 0xfabc Device ID: 0x0001 Command register: 0x0003 I/O space accesses: on Memory space accesses: on Bus mastering: off Special cycles: off MWI transactions: off Palette snooping: off Parity error checking: off Address/data stepping: off System error (SERR): off Fast back-to-back transactions: off Interrupt disable: off Status register: 0x Immediate Readiness: off Interrupt status: inactive Capability List support: off 66 MHz capable: off User Definable Features (UDF) support: off Fast back-to-back capable: off Data parity error detected: off DEVSEL timing: fast (0x0) Slave signaled Target Abort: off Master received Target Abort: off Master received Master Abort: off Asserted System Error (SERR): off Parity error detected: off Class Name: processor (0x0b) Subclass Name: Co-processor (0x40) Interface: 0x00 Revision ID: 0x01 BIST: 0x00 Header Type: 0x00 (0x00) Latency Timer: 0x00 Cache Line Size: 0bytes (0x00) Type 0 ("normal" device) header: 0x10: 0x1011 0x 0x 0x 0x20: 0x 0x 0x 0x 0x30: 0x 0x 0x 0x Base address register at 0x10 type: 32-bit nonprefetchable memory base: 0x1011 Base address register at 0x14 not implemented Base address register at 0x18 not implemented Base address register at 0x1c not implemented Base address register at 0x20 not implemented Base address register at 0x24 not implemented Cardbus CIS Pointer: 0x Subsystem vendor ID: 0x Subsystem ID: 0x Expansion ROM Base Address Register: 0x base: 0x Expansion ROM Enable: off Validation Status: Validation not supported Validation Details: 0x0 Reserved @ 0x34: 0x Reserved @ 0x38: 0x Maximum Latency: 0x00 Minimum Grant: 0x00 Interrupt pin: 0x00 (none) Interrupt line: 0x00 Device-dependent header: 0x40: 0x 0x 0x 0x 0x50: 0x 0x 0x 0x 0x60: 0x 0x 0x 0x 0x70: 0x 0x 0x 0x 0x80: 0x 0x 0x 0x 0x90: 0x 0x 0x 0x 0xa0: 0x 0x 0x 0x 0xb0: 0x 0x 0x 0x 0xc0: 0x 0x 0x 0x 0xd0: 0x 0x 0x 0x 0xe0: 0x 0x 0x 0x 0xf0: 0x 0x 0x 0x