Re: vmd(8): fix vmctl wait state corruption

2021-04-24 Thread Dave Voutila


Dave Voutila writes:

> Dave Voutila writes:
>
>> vmd(8) users of tech@,
>>
>> NOTE: I have no intention to try to commit this prior to 6.9's release
>> due to its complexity, but I didn't want to "wait" to solicit testers or
>> potential feedback.
>
> Freeze is over, so bumping this thread with an updated diff below.
>

Now that there's been some testing and snaps are building once again,
anyone willing to review & OK?
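
To make the design concrete before diving into the diff, here is a
minimal, self-contained sketch of the fan-out idea: a TAILQ of (fd, vmid)
waiters that is drained when a termination event for that vmid arrives.
The names are hypothetical; the actual diff below keeps this queue inside
the control process.

#include <sys/queue.h>

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct waiter {
	int			 fd;	/* control client socket */
	uint32_t		 vmid;	/* vm being waited on */
	TAILQ_ENTRY(waiter)	 entry;
};
TAILQ_HEAD(, waiter) waiters = TAILQ_HEAD_INITIALIZER(waiters);

/* called when the vm identified by vmid terminates */
void
notify_terminated(uint32_t vmid)
{
	struct waiter *w, *next;

	TAILQ_FOREACH_SAFE(w, &waiters, entry, next) {
		if (w->vmid != vmid)
			continue;
		/* the real code relays a TERMINATE_VM_RESPONSE here */
		printf("notify fd %d: vm %u gone\n", w->fd, w->vmid);
		TAILQ_REMOVE(&waiters, w, entry);
		free(w);
	}
}

int
main(void)
{
	struct waiter *w;

	if ((w = calloc(1, sizeof(*w))) == NULL)
		return 1;
	w->fd = 7;
	w->vmid = 1;
	TAILQ_INSERT_TAIL(&waiters, w, entry);
	notify_terminated(1);
	return 0;
}

Because each waiter is its own entry, any number of clients can wait on
one vm, and a disconnecting client can be unlinked without ever touching
the parent's global vm state.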

>>
>> I noticed recently that I could not have two vmctl(8) clients "wait" for
>> the same vm to shut down, as one would cancel the other. Worse yet, if you
>> cancel a wait (^C) you can effectively corrupt the state used for
>> tracking the waiting client, preventing future clients from waiting on
>> the vm.
>>
>> It turns out the socket fd of the vmctl(8) client is being sent by the
>> control process as the peerid in the imsg. This fd is being stored on
>> the vmd_vm structure in the vm_peerid member, but this fd only has
>> meaning in the scope of the control process. Consequently:
>>
>> - only 1 value can be stored at a time, meaning only 1 waiting client
>>   can exist at a time
>> - since vm_peerid is used to record whether another vmd(8) process is
>>   waiting on the vm, vm_peerid can be corrupted by vmctl(8)
>> - the control process cannot update waiting state on vmctl disconnects,
>>   and since fds are reused it's possible the message could be sent to a
>>   vmctl(8) client performing an operation other than a "wait"
>>
>> The below diff:
>>
>> 1. enables support for multiple vmctl(8) clients to wait on the same vm
>>to terminate
>> 2. keeps the wait state in the control process and out of the parent's
>>global vm state, tracking waiting parties in a TAILQ
>> 3. removes the waiting client state on client disconnect/cancellation
>> 4. simplifies vmd(8) by removing IMSG_VMDOP_WAIT_VM_REQUEST handling
>>from the vmm process, which isn't needed (and was partially
>>responsible for the corruption)
>>
>
> The above design still stands, but I've fixed some messaging issues
> related to the fact that the parent process was forwarding
> IMSG_VMDOP_TERMINATE_VM_RESPONSE messages directly to the control
> process, resulting in duplicate messages. This broke doing a `vmctl stop`
> for all vms (-a) combined with waiting (-w). It now only forwards errors.
>
>> There are some subsequent tweaks that may follow this diff, specifically
>> one related to the fact that I've switched the logic to send
>> IMSG_VMDOP_TERMINATE_VM_EVENT messages to the control process (which
>> makes sense to me), while control relays an
>> IMSG_VMDOP_TERMINATE_VM_RESPONSE message to the waiting vmctl(8) client.
>> I'd need to update vmctl(8) to look for the event message instead and
>> don't want to complicate the diff further.
>>
>> If any testers out there can try to break this for me it would be much
>> appreciated. :-)
>>
>
> Testers? I'd like to give people a few days to kick the tires before
> asking for OK to commit.
>
> -dv
>
>
> Index: control.c
> ===
> RCS file: /cvs/src/usr.sbin/vmd/control.c,v
> retrieving revision 1.34
> diff -u -p -r1.34 control.c
> --- control.c 20 Apr 2021 21:11:56 -  1.34
> +++ control.c 21 Apr 2021 17:17:04 -
> @@ -41,6 +41,13 @@
>
>  struct ctl_connlist ctl_conns = TAILQ_HEAD_INITIALIZER(ctl_conns);
>
> +struct ctl_notify {
> + int ctl_fd;
> + uint32_tctl_vmid;
> + TAILQ_ENTRY(ctl_notify) entry;
> +};
> +TAILQ_HEAD(ctl_notify_q, ctl_notify) ctl_notify_q =
> + TAILQ_HEAD_INITIALIZER(ctl_notify_q);
>  void
>control_accept(int, short, void *);
>  struct ctl_conn
> @@ -78,7 +85,10 @@ int
>  control_dispatch_vmd(int fd, struct privsep_proc *p, struct imsg *imsg)
>  {
>   struct ctl_conn *c;
> + struct ctl_notify   *notify = NULL, *notify_next;
>   struct privsep  *ps = p->p_ps;
> + struct vmop_result   vmr;
> + int  waiting = 0;
>
>   switch (imsg->hdr.type) {
>   case IMSG_VMDOP_START_VM_RESPONSE:
> @@ -86,11 +96,11 @@ control_dispatch_vmd(int fd, struct priv
>   case IMSG_VMDOP_SEND_VM_RESPONSE:
>   case IMSG_VMDOP_RECEIVE_VM_RESPONSE:
>   case IMSG_VMDOP_UNPAUSE_VM_RESPONSE:
> - case IMSG_VMDOP_TERMINATE_VM_RESPONSE:
>   case IMSG_VMDOP_GET_INFO_VM_DATA:
>   case IMSG_VMDOP_GET_INFO_VM_END_DATA:
>   case IMSG_CTL_FAIL:
>   case IMSG_CTL_OK:
> + /* Provide basic response back to a specific control client */
>   if ((c = control_connbyfd(imsg->hdr.peerid)) == NULL) {
>   log_warnx("%s: lost control connection: fd %d",
>   __func__, imsg->hdr.peerid);
> @@ -99,6 +109,61 @@ control_dispatch_vmd(int fd, struct priv
>   imsg_compose_event(&c->iev, imsg->hdr.type,
>   0, 0, imsg->fd, imsg->data, IMSG_DATA_SIZE(imsg));
>   break;
> + case IMSG_VMDOP_TERMINATE_VM_RESPONSE:

Re: Stop/unstop process & xsig

2021-04-24 Thread Martin Pieuchot
On 24/04/21(Sat) 12:49, Mark Kettenis wrote:
> > Date: Sat, 24 Apr 2021 12:23:17 +0200
> > From: Martin Pieuchot 
> > 
> > On 20/03/21(Sat) 13:25, Martin Pieuchot wrote:
> > > Diff below refactors routines to stop/unstop processes and save the signal
> > > number which will/can be transmitted in wait4(2).  It does the 
> > > following:
> > > 
> > > - Move the "hack" involving P_SINTR to avoid grabbing the SCHED_LOCK()
> > >   recursively inside proc_stop().
> > > 
> > > - Introduce proc_unstop(), the symmetric routine to proc_stop().
> > > 
> > > - Manipulate `ps_xsig' only in proc_stop/unstop().
> > > 
> > > Ok?
> > 
> > Anyone?
> 
> This is not ok...
> 
> > 
> > > Index: kern/kern_sig.c
> > > ===
> > > RCS file: /cvs/src/sys/kern/kern_sig.c,v
> > > retrieving revision 1.278
> > > diff -u -p -r1.278 kern_sig.c
> > > --- kern/kern_sig.c   12 Mar 2021 10:13:28 -  1.278
> > > +++ kern/kern_sig.c   20 Mar 2021 12:16:51 -
> > > @@ -124,7 +124,7 @@ const int sigprop[NSIG + 1] = {
> > >  
> > >  void setsigvec(struct proc *, int, struct sigaction *);
> > >  
> > > -void proc_stop(struct proc *p, int);
> > > +int proc_stop(struct proc *p, int, int);
> > >  void proc_stop_sweep(void *);
> > >  void *proc_stop_si;
> > >  
> > > @@ -1061,8 +1061,7 @@ ptsignal(struct proc *p, int signum, enu
> > >   if (pr->ps_flags & PS_PPWAIT)
> > >   goto out;
> > >   atomic_clearbits_int(siglist, mask);
> > > - pr->ps_xsig = signum;
> > > - proc_stop(p, 0);
> > > + proc_stop(p, signum, 0);
> > >   goto out;
> > >   }
> > >   /*
> > > @@ -1170,17 +1169,12 @@ out:
> > >   *
> > >   *   while (signum = cursig(curproc))
> > >   *   postsig(signum);
> > > - *
> > > - * Assumes that if the P_SINTR flag is set, we're holding both the
> > > - * kernel and scheduler locks.
> > >   */
> > >  int
> > >  cursig(struct proc *p)
> > >  {
> > >   struct process *pr = p->p_p;
> > >   int sigpending, signum, mask, prop;
> > > - int dolock = (p->p_flag & P_SINTR) == 0;
> > > - int s;
> > >  
> > >   KERNEL_ASSERT_LOCKED();
> > >  
> > > @@ -1217,31 +1211,22 @@ cursig(struct proc *p)
> > >*/
> > >   if (((pr->ps_flags & (PS_TRACED | PS_PPWAIT)) == PS_TRACED) &&
> > >   signum != SIGKILL) {
> > > - pr->ps_xsig = signum;
> > >  
> > >   single_thread_set(p, SINGLE_SUSPEND, 0);
> > > -
> > > - if (dolock)
> > > - SCHED_LOCK(s);
> > > - proc_stop(p, 1);
> > > - if (dolock)
> > > - SCHED_UNLOCK(s);
> > > -
> > > + signum = proc_stop(p, signum, 1);
> 
> At this point the process will sleep since proc_stop() calls
> mi_switch().  At that point the debugger may clear or change the value
> of ps_xsig.

Indeed, for that exact reason `ps_xsig' is read at the end of
proc_stop() once the thread has returned from mi_switch().  Are you saying
this is not enough?

Isn't it equivalent to moving the assignment below to just before the
test?  My understanding is that this is safe because all this code is
currently executed under a lock common to all the threads.
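
Condensed, the path we are discussing looks like this (a sketch abridged
from the diff, with `...' eliding unchanged code; not the patch itself):

	/* cursig() */
	signum = proc_stop(p, signum, 1);

	int
	proc_stop(struct proc *p, int signum, int sw)
	{
		struct process *pr = p->p_p;

		pr->ps_xsig = signum;	/* publish the stop signal */
		...
		if (sw)
			mi_switch();	/* sleep; a debugger may rewrite ps_xsig */
		...
		return (pr->ps_xsig);	/* re-read once we are back */
	}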

> 
> > >   single_thread_clear(p, 0);
> > >  
> > >   /*
> > >* If we are no longer being traced, or the parent
> > >* didn't give us a signal, look for more signals.
> > >*/
> > > - if ((pr->ps_flags & PS_TRACED) == 0 ||
> > > - pr->ps_xsig == 0)
> > > + if ((pr->ps_flags & PS_TRACED) == 0 || signum == 0)
> > >   continue;
> 
> So this change is wrong.
> 
> > >  
> > >   /*
> > >* If the new signal is being masked, look for other
> > >* signals.
> > >*/
> > > - signum = pr->ps_xsig;
> 
> So this assignment is *not* redundant.
> 
> > >   mask = sigmask(signum);
> > >   if ((p->p_sigmask & mask) != 0)
> > >   continue;
> > > @@ -1286,12 +1271,7 @@ cursig(struct proc *p)
> > >   (pr->ps_pgrp->pg_jobc == 0 &&
> > >   prop & SA_TTYSTOP))
> > >   break;  /* == ignore */
> > > - pr->ps_xsig = signum;
> > > - if (dolock)
> > > - SCHED_LOCK(s);
> > > - proc_stop(p, 1);
> > > - if (dolock)
> > > - SCHED_UNLOCK(s);
> > > + proc_stop(p, signum, 1);
> > >   break;
> > >   } else if (prop & SA_IGNORE) {
> > 

Fwd: Fwd: uvm_map returns unaligned address?

2021-04-24 Thread Alessandro Pistocchi
forwarding here so that attachments are not removed

-- Forwarded message -
From: Alessandro Pistocchi 
Date: Sat, Apr 24, 2021 at 3:50 AM
Subject: Re: Fwd: uvm_map returns unaligned address?
To: Theo de Raadt , Philip Guenther ,
OpenBSD misc 


Hi,
apologies again.

I am not familiar with cvs, so I used the diff command between an
original sys tree and mine.

I am attaching the diff file and two files to be put in sys/kern. Sorry,
but the diff command did not include the contents of the two new files
in the diff itself.

Please note that this code is not intended to go into the OpenBSD kernel,
as it would probably not be interesting to general users.

What I am doing here is adding a syscall that stops the scheduler on 3 of
the 4 cores, runs my own code on those while the calling process runs on
the remaining core, and restarts scheduling on the 3 cores when the
process terminates.
The shared memory is used by the 3 cores to communicate with the process.
I know it might sound crazy to you, but it makes a lot of sense for what
I am trying to achieve.

Regarding the mapping, I hope that it works well enough even with
reordering issues and other stuff.
It does for the example I am sending and I don't need much more than that.

What I was flagging is just that sometimes uvm_map returns an address
that is not aligned to PAGE_SIZE (I printed it out and it has 0x004 in
the lower 12 bits), while uvm_unmap has an assertion that panics if the
address passed to it is not page aligned. I believe there could be a bug
somewhere.

In previous runs I was consistently getting an unaligned address, but now
I get an aligned one.

Best :-)
Alessandro

On Sat, Apr 24, 2021 at 2:32 AM Theo de Raadt  wrote:

> Alessandro Pistocchi  wrote:
>
> > During the syscall I allocate some memory that I want to share between
> the
> > kernel and the calling process.
>
> When you get the mapping working, it will not work as well as you like.
>
> Compiler re-ordering of writes & reads, caches, write buffers, and other
> details such as non-TSO will create problems, meaning you'll need very
> careful data-structure design, and great care with co-dependent fields,
> otherwise one side can fool the other.
>
> That is why the copyin/copyout approach is used as a high-level
> serializing-forcing primitive in most operating systems.
>
> But since you don't show your whole diff.. I suspect this is a waste of
> time.
>
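
To illustrate the ordering problem Theo describes, here is a minimal
userspace sketch with C11 atomics (not kernel code): without the
release/acquire pair, the consumer may observe the flag before the
payload it guards.

#include <stdatomic.h>
#include <stdio.h>

static int data;		/* payload written by one side */
static atomic_int ready;	/* flag the other side polls */

void
producer(void)
{
	data = 42;		/* plain store */
	/* release: makes the payload visible before the flag */
	atomic_store_explicit(&ready, 1, memory_order_release);
}

int
consumer(void)
{
	/* acquire: pairs with the release above */
	while (!atomic_load_explicit(&ready, memory_order_acquire))
		;
	return data;		/* guaranteed to read 42 */
}

int
main(void)
{
	producer();
	printf("%d\n", consumer());
	return 0;
}

copyin(9)/copyout(9) sidestep all of this by copying whole buffers at one
well-defined point instead of sharing live memory between the two sides.
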
struct uvm_object *game_uvm_object;

void game_mode_create_shared_memory_region(struct proc *p) {
	game_uvm_object = uao_create(game_memory_size, 0);
	if(!game_uvm_object) return;

	// TODO(ale): make sure that this memory cannot be swapped out

	uao_reference(game_uvm_object);
	if(uvm_map(kernel_map, (vaddr_t *)&game_memory, round_page(game_memory_size), game_uvm_object,
		0, 0, UVM_MAPFLAG(PROT_READ | PROT_WRITE, PROT_READ | PROT_WRITE,
		MAP_INHERIT_SHARE, MADV_NORMAL, 0))) {
		uao_detach(game_uvm_object);
		game_uvm_object = 0;
		return;
	}

	uao_reference(game_uvm_object);
	if(uvm_map(&p->p_vmspace->vm_map, (vaddr_t *)&game_memory_in_proc_space, round_page(game_memory_size), game_uvm_object,
		0, 0, UVM_MAPFLAG(PROT_READ | PROT_WRITE, PROT_READ | PROT_WRITE,
		MAP_INHERIT_NONE, MADV_NORMAL, 0))) {
		// TODO(ale): the kernel mapping returns an address for game_memory that is 4 bytes more than page aligned. This panics.
		uvm_unmap(kernel_map, trunc_page((vaddr_t)game_memory), round_page((vaddr_t)game_memory + game_memory_size));
		game_memory = 0;
		uao_detach(game_uvm_object);
		uao_detach(game_uvm_object);
		game_uvm_object = 0;
		return;
	}

	game_mode_proc = p;

}

void game_mode_free_shared_memory_region(void) {
	if(game_memory_in_proc_space)uvm_unmap(&game_mode_proc->p_vmspace->vm_map, trunc_page((vaddr_t)game_memory_in_proc_space),
		round_page((vaddr_t)game_memory_in_proc_space + game_memory_size));
	// TODO(ale): the kernel mapping returns an address for game_memory that is 4 bytes more than page aligned. This panics.
	if(game_memory)uvm_unmap(kernel_map, trunc_page((vaddr_t)game_memory), round_page((vaddr_t)game_memory + game_memory_size));
	if(game_uvm_object){
		uao_detach(game_uvm_object);
		uao_detach(game_uvm_object);
	}
	game_memory_in_proc_space = game_memory = 0;
	game_uvm_object = 0;
}

// static inline void *phisical_from_virtual(struct proc *p, void *va) {
// 	paddr_t pa;
// 	vaddr_t faddr = trunc_page((vaddr_t)va), off = ((vaddr_t)va) - faddr;
// 	if (pmap_extract(vm_map_pmap(&p->p_vmspace->vm_map), faddr, &pa) == FALSE) panic("vmapbuf: null page frame");
// 	return (void *)(pa+(paddr_t)off);
// }
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include  
#include 

#define VBLANK_THREAD 		1
#define AUDIO_THREAD 		2
#define INPUT_THREAD 		3

void *game_memory;
void *game_memory_in_proc_space;
vsize_t game_memory_size;

void sched_start_secondary_cpus(void);
void 

Frequency Scaling Userspace Exposure

2021-04-24 Thread Al Poole
Is it feasible to introduce a min and max frequency exposed to userspace?

I understand the reason behind using hw.setperf percentages.

As an application developer, I'd find this very useful.

Just a thought.

Thanks.
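
For context, what userspace can already query is the current clock speed
and the performance percentage; a minimal sketch using sysctl(3) with the
existing HW_CPUSPEED and HW_SETPERF MIBs:

#include <sys/types.h>
#include <sys/sysctl.h>

#include <stdio.h>

int
main(void)
{
	int mib[2] = { CTL_HW, HW_CPUSPEED };
	int speed, perf;
	size_t len;

	len = sizeof(speed);
	if (sysctl(mib, 2, &speed, &len, NULL, 0) == -1)
		return 1;

	mib[1] = HW_SETPERF;
	len = sizeof(perf);
	if (sysctl(mib, 2, &perf, &len, NULL, 0) == -1)
		return 1;

	printf("%d MHz at hw.setperf=%d%%\n", speed, perf);
	return 0;
}

A min/max frequency pair could presumably be exposed the same way, as two
more read-only integers next to these.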


Re: Unlock top part of uvm_fault()

2021-04-24 Thread Christian Weisgerber
Christian Weisgerber:

> > Diff below removes the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> > both amd64 and sparc64.  That means the kernel lock will only be taken
> > for lower faults and some amap/anon code will now run without it.
> 
> I ran an amd64 bulk build with this diff on top of 6.9.

PS: 4 machines, 4 cores each (Xeon CPU E3-1270 v6, HTT disabled)

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: Unlock top part of uvm_fault()

2021-04-24 Thread Stuart Henderson
On 2021/04/22 15:38, Martin Pieuchot wrote:
> Diff below removes the KERNEL_LOCK()/UNLOCK() dance from uvm_fault() for
> both amd64 and sparc64.  That means the kernel lock will only be taken
> for lower faults and some amap/anon code will now run without it.
> 
> I'd be interested to have this tested and to see how much it impacts
> the build time of packages.
> 
> We should be able to do the switch on an arch-by-arch basis.  It's
> easier for me to develop & debug on these two architectures so I started
> with them.  If you want to unlock another architecture and report back,
> I'd be glad.

i386 with the below diff has survived a ports bulk build; machines are
quad core but I only run 3 concurrent builds to keep a cap on RAM use.

Hard to say if there's any change in build time as it's a bit variable
anyway; seems to be not much change either way. Perhaps this is more
important with larger numbers of cores though.

Index: arch/i386/i386/trap.c
===
RCS file: /cvs/src/sys/arch/i386/i386/trap.c,v
retrieving revision 1.151
diff -u -p -r1.151 trap.c
--- arch/i386/i386/trap.c   27 Oct 2020 19:17:12 -  1.151
+++ arch/i386/i386/trap.c   22 Apr 2021 20:20:58 -
@@ -126,10 +126,7 @@ upageflttrap(struct trapframe *frame, ui
union sigval sv;
int signal, sicode, error;
 
-   KERNEL_LOCK();
error = uvm_fault(&p->p_vmspace->vm_map, va, 0, access_type);
-   KERNEL_UNLOCK();
-
if (error == 0) {
uvm_grow(p, va);
return 1;
@@ -203,9 +200,7 @@ kpageflttrap(struct trapframe *frame, ui
if (curcpu()->ci_inatomic == 0 || map == kernel_map) {
onfault = pcb->pcb_onfault;
pcb->pcb_onfault = NULL;
-   KERNEL_LOCK();
error = uvm_fault(map, va, 0, access_type);
-   KERNEL_UNLOCK();
pcb->pcb_onfault = onfault;
 
if (error == 0 && map != kernel_map)



Re: uvm_km_kmemalloc{,_pla}()

2021-04-24 Thread Martin Pieuchot
On 24/04/21(Sat) 12:25, Mark Kettenis wrote:
> > Date: Sat, 24 Apr 2021 12:02:22 +0200
> > From: Martin Pieuchot 
> > 
> > Diff below merges the two allocators into one and removes unused
> > alignment/boundary arguments.  This is a small cleanup that helps
> > me keep track of the remaining allocators.
> > 
> > ok?
> 
> Not sure.  Is uvm_km_kmemalloc() going to be replaced in the future?

I don't know yet.

> At the very least, you need to adjust the man page.

Sure.

> > Index: arch/arm/arm/pmap7.c
> > ===
> > RCS file: /cvs/src/sys/arch/arm/arm/pmap7.c,v
> > retrieving revision 1.61
> > diff -u -p -r1.61 pmap7.c
> > --- arch/arm/arm/pmap7.c25 Mar 2021 04:12:00 -  1.61
> > +++ arch/arm/arm/pmap7.c24 Apr 2021 09:53:11 -
> > @@ -2435,8 +2435,9 @@ pmap_bootstrap_pv_page_alloc(struct pool
> > return (rv);
> > }
> >  
> > -   new_page = uvm_km_kmemalloc(kernel_map, NULL, PAGE_SIZE,
> > -   (flags & PR_WAITOK) ? 0 : UVM_KMF_NOWAIT);
> > +   new_page = uvm_km_kmemalloc(kernel_map, NULL, PAGE_SIZE, 0,
> > +   (flags & PR_WAITOK) ? 0 : UVM_KMF_NOWAIT, no_constraint.ucr_low,
> > +   no_constraint.ucr_high, 0);
> >  
> > last_bootstrap_page = new_page;
> > return ((void *)new_page);
> > Index: dev/pci/envy.c
> > ===
> > RCS file: /cvs/src/sys/dev/pci/envy.c,v
> > retrieving revision 1.81
> > diff -u -p -r1.81 envy.c
> > --- dev/pci/envy.c  5 Jan 2020 01:07:58 -   1.81
> > +++ dev/pci/envy.c  24 Apr 2021 09:53:12 -
> > @@ -1834,9 +1834,9 @@ envy_allocm(void *self, int dir, size_t 
> >  #define ENVY_ALIGN 4
> >  #define ENVY_MAXADDR   ((1 << 28) - 1)
> >  
> > -   buf->addr = (caddr_t)uvm_km_kmemalloc_pla(kernel_map,
> > +   buf->addr = (caddr_t)uvm_km_kmemalloc(kernel_map,
> > uvm.kernel_object, buf->size, 0, UVM_KMF_NOWAIT, 0,
> > -   (paddr_t)ENVY_MAXADDR, 0, 0, 1);
> > +   (paddr_t)ENVY_MAXADDR, 1);
> > if (buf->addr == NULL) {
> > DPRINTF("%s: unable to alloc dma segment\n", DEVNAME(sc));
> > goto err_ret;
> > Index: dev/pci/if_bce.c
> > ===
> > RCS file: /cvs/src/sys/dev/pci/if_bce.c,v
> > retrieving revision 1.53
> > diff -u -p -r1.53 if_bce.c
> > --- dev/pci/if_bce.c10 Jul 2020 13:22:20 -  1.53
> > +++ dev/pci/if_bce.c24 Apr 2021 09:53:12 -
> > @@ -253,9 +253,9 @@ bce_attach(struct device *parent, struct
> > bce_reset(sc);
> >  
> > /* Create the data DMA region and maps. */
> > -   if ((sc->bce_data = (caddr_t)uvm_km_kmemalloc_pla(kernel_map,
> > +   if ((sc->bce_data = (caddr_t)uvm_km_kmemalloc(kernel_map,
> > uvm.kernel_object, (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES, 0,
> > -   UVM_KMF_NOWAIT, 0, (paddr_t)(0x4000 - 1), 0, 0, 1)) == NULL) {
> > +   UVM_KMF_NOWAIT, 0, (paddr_t)(0x4000 - 1), 1)) == NULL) {
> > printf(": unable to alloc space for ring");
> > return;
> > }
> > Index: kern/kern_malloc.c
> > ===
> > RCS file: /cvs/src/sys/kern/kern_malloc.c,v
> > retrieving revision 1.145
> > diff -u -p -r1.145 kern_malloc.c
> > --- kern/kern_malloc.c  21 Apr 2021 10:02:05 -  1.145
> > +++ kern/kern_malloc.c  24 Apr 2021 09:53:12 -
> > @@ -228,12 +228,12 @@ malloc(size_t size, int type, int flags)
> > mtx_leave(&malloc_mtx);
> > npg = atop(round_page(allocsize));
> > s = splvm();
> > -   va = (caddr_t)uvm_km_kmemalloc_pla(kmem_map, NULL,
> > +   va = (caddr_t)uvm_km_kmemalloc(kmem_map, NULL,
> > (vsize_t)ptoa(npg), 0,
> > ((flags & M_NOWAIT) ? UVM_KMF_NOWAIT : 0) |
> > ((flags & M_CANFAIL) ? UVM_KMF_CANFAIL : 0),
> > no_constraint.ucr_low, no_constraint.ucr_high,
> > -   0, 0, 0);
> > +   0);
> > splx(s);
> > if (va == NULL) {
> > /*
> > Index: uvm/uvm_extern.h
> > ===
> > RCS file: /cvs/src/sys/uvm/uvm_extern.h,v
> > retrieving revision 1.157
> > diff -u -p -r1.157 uvm_extern.h
> > --- uvm/uvm_extern.h12 Mar 2021 14:15:49 -  1.157
> > +++ uvm/uvm_extern.h24 Apr 2021 09:53:12 -
> > @@ -297,11 +297,9 @@ intuvm_io(vm_map_t, struct uio *, 
> > int
> >  vaddr_tuvm_km_alloc1(vm_map_t, vsize_t, vsize_t, 
> > boolean_t);
> >  void   uvm_km_free(vm_map_t, vaddr_t, vsize_t);
> >  void   uvm_km_free_wakeup(vm_map_t, vaddr_t, vsize_t);
> > -vaddr_tuvm_km_kmemalloc_pla(struct vm_map *,
> > +vaddr_tuvm_km_kmemalloc(struct vm_map *,
> > struct 

Re: Stop/unstop process & xsig

2021-04-24 Thread Mark Kettenis
> Date: Sat, 24 Apr 2021 12:23:17 +0200
> From: Martin Pieuchot 
> 
> On 20/03/21(Sat) 13:25, Martin Pieuchot wrote:
> > Diff below refactors routines to stop/unstop processes and save the signal
> > number which will/can be transmitted in wait4(2).  It does the following:
> > 
> > - Move the "hack" involving P_SINTR to avoid grabbing the SCHED_LOCK()
> >   recursively inside proc_stop().
> > 
> > - Introduce proc_unstop(), the symmetric routine to proc_stop().
> > 
> > - Manipulate `ps_xsig' only in proc_stop/unstop().
> > 
> > Ok?
> 
> Anyone?

This is not ok...

> 
> > Index: kern/kern_sig.c
> > ===
> > RCS file: /cvs/src/sys/kern/kern_sig.c,v
> > retrieving revision 1.278
> > diff -u -p -r1.278 kern_sig.c
> > --- kern/kern_sig.c 12 Mar 2021 10:13:28 -  1.278
> > +++ kern/kern_sig.c 20 Mar 2021 12:16:51 -
> > @@ -124,7 +124,7 @@ const int sigprop[NSIG + 1] = {
> >  
> >  void setsigvec(struct proc *, int, struct sigaction *);
> >  
> > -void proc_stop(struct proc *p, int);
> > +int proc_stop(struct proc *p, int, int);
> >  void proc_stop_sweep(void *);
> >  void *proc_stop_si;
> >  
> > @@ -1061,8 +1061,7 @@ ptsignal(struct proc *p, int signum, enu
> > if (pr->ps_flags & PS_PPWAIT)
> > goto out;
> > atomic_clearbits_int(siglist, mask);
> > -   pr->ps_xsig = signum;
> > -   proc_stop(p, 0);
> > +   proc_stop(p, signum, 0);
> > goto out;
> > }
> > /*
> > @@ -1170,17 +1169,12 @@ out:
> >   *
> >   * while (signum = cursig(curproc))
> >   * postsig(signum);
> > - *
> > - * Assumes that if the P_SINTR flag is set, we're holding both the
> > - * kernel and scheduler locks.
> >   */
> >  int
> >  cursig(struct proc *p)
> >  {
> > struct process *pr = p->p_p;
> > int sigpending, signum, mask, prop;
> > -   int dolock = (p->p_flag & P_SINTR) == 0;
> > -   int s;
> >  
> > KERNEL_ASSERT_LOCKED();
> >  
> > @@ -1217,31 +1211,22 @@ cursig(struct proc *p)
> >  */
> > if (((pr->ps_flags & (PS_TRACED | PS_PPWAIT)) == PS_TRACED) &&
> > signum != SIGKILL) {
> > -   pr->ps_xsig = signum;
> >  
> > single_thread_set(p, SINGLE_SUSPEND, 0);
> > -
> > -   if (dolock)
> > -   SCHED_LOCK(s);
> > -   proc_stop(p, 1);
> > -   if (dolock)
> > -   SCHED_UNLOCK(s);
> > -
> > +   signum = proc_stop(p, signum, 1);

At this point the process will sleep since proc_stop() calls
mi_switch().  At that point the debugger may clear or change the value
of ps_xsig.

> > single_thread_clear(p, 0);
> >  
> > /*
> >  * If we are no longer being traced, or the parent
> >  * didn't give us a signal, look for more signals.
> >  */
> > -   if ((pr->ps_flags & PS_TRACED) == 0 ||
> > -   pr->ps_xsig == 0)
> > +   if ((pr->ps_flags & PS_TRACED) == 0 || signum == 0)
> > continue;

So this change is wrong.

> >  
> > /*
> >  * If the new signal is being masked, look for other
> >  * signals.
> >  */
> > -   signum = pr->ps_xsig;

So this assignment is *not* redundant.

> > mask = sigmask(signum);
> > if ((p->p_sigmask & mask) != 0)
> > continue;
> > @@ -1286,12 +1271,7 @@ cursig(struct proc *p)
> > (pr->ps_pgrp->pg_jobc == 0 &&
> > prop & SA_TTYSTOP))
> > break;  /* == ignore */
> > -   pr->ps_xsig = signum;
> > -   if (dolock)
> > -   SCHED_LOCK(s);
> > -   proc_stop(p, 1);
> > -   if (dolock)
> > -   SCHED_UNLOCK(s);
> > +   proc_stop(p, signum, 1);
> > break;
> > } else if (prop & SA_IGNORE) {
> > /*
> > @@ -1331,15 +1311,21 @@ keep:
> >   * Put the argument process into the stopped state and notify the parent
> >   * via wakeup.  Signals are handled elsewhere.  The process must not be
> >   * on the run queue.
> > + *
> > + * Assumes that if the P_SINTR flag is set, we're holding the scheduler
> > + * lock.
> >   */
> > -void
> > -proc_stop(struct proc *p, int sw)
> > +int
> > +proc_stop(struct proc *p, int signum, int sw)
> >  {
> > struct process *pr = p->p_p;
> > +   int dolock = (p->p_flag & P_SINTR) == 0;
> > +  

Re: uvm_km_kmemalloc{,_pla}()

2021-04-24 Thread Mark Kettenis
> Date: Sat, 24 Apr 2021 12:02:22 +0200
> From: Martin Pieuchot 
> 
> Diff below merges the two allocators into one and removes unused
> alignment/boundary arguments.  This is a small cleanup that helps
> me keep track of the remaining allocators.
> 
> ok?

Not sure.  Is uvm_km_kmemalloc() going to be replaced in the future?

At the very least, you need to adjust the man page.

> Index: arch/arm/arm/pmap7.c
> ===
> RCS file: /cvs/src/sys/arch/arm/arm/pmap7.c,v
> retrieving revision 1.61
> diff -u -p -r1.61 pmap7.c
> --- arch/arm/arm/pmap7.c  25 Mar 2021 04:12:00 -  1.61
> +++ arch/arm/arm/pmap7.c  24 Apr 2021 09:53:11 -
> @@ -2435,8 +2435,9 @@ pmap_bootstrap_pv_page_alloc(struct pool
>   return (rv);
>   }
>  
> - new_page = uvm_km_kmemalloc(kernel_map, NULL, PAGE_SIZE,
> - (flags & PR_WAITOK) ? 0 : UVM_KMF_NOWAIT);
> + new_page = uvm_km_kmemalloc(kernel_map, NULL, PAGE_SIZE, 0,
> + (flags & PR_WAITOK) ? 0 : UVM_KMF_NOWAIT, no_constraint.ucr_low,
> + no_constraint.ucr_high, 0);
>  
>   last_bootstrap_page = new_page;
>   return ((void *)new_page);
> Index: dev/pci/envy.c
> ===
> RCS file: /cvs/src/sys/dev/pci/envy.c,v
> retrieving revision 1.81
> diff -u -p -r1.81 envy.c
> --- dev/pci/envy.c5 Jan 2020 01:07:58 -   1.81
> +++ dev/pci/envy.c24 Apr 2021 09:53:12 -
> @@ -1834,9 +1834,9 @@ envy_allocm(void *self, int dir, size_t 
>  #define ENVY_ALIGN   4
>  #define ENVY_MAXADDR ((1 << 28) - 1)
>  
> - buf->addr = (caddr_t)uvm_km_kmemalloc_pla(kernel_map,
> + buf->addr = (caddr_t)uvm_km_kmemalloc(kernel_map,
>   uvm.kernel_object, buf->size, 0, UVM_KMF_NOWAIT, 0,
> - (paddr_t)ENVY_MAXADDR, 0, 0, 1);
> + (paddr_t)ENVY_MAXADDR, 1);
>   if (buf->addr == NULL) {
>   DPRINTF("%s: unable to alloc dma segment\n", DEVNAME(sc));
>   goto err_ret;
> Index: dev/pci/if_bce.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_bce.c,v
> retrieving revision 1.53
> diff -u -p -r1.53 if_bce.c
> --- dev/pci/if_bce.c  10 Jul 2020 13:22:20 -  1.53
> +++ dev/pci/if_bce.c  24 Apr 2021 09:53:12 -
> @@ -253,9 +253,9 @@ bce_attach(struct device *parent, struct
>   bce_reset(sc);
>  
>   /* Create the data DMA region and maps. */
> - if ((sc->bce_data = (caddr_t)uvm_km_kmemalloc_pla(kernel_map,
> + if ((sc->bce_data = (caddr_t)uvm_km_kmemalloc(kernel_map,
>   uvm.kernel_object, (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES, 0,
> - UVM_KMF_NOWAIT, 0, (paddr_t)(0x4000 - 1), 0, 0, 1)) == NULL) {
> + UVM_KMF_NOWAIT, 0, (paddr_t)(0x4000 - 1), 1)) == NULL) {
>   printf(": unable to alloc space for ring");
>   return;
>   }
> Index: kern/kern_malloc.c
> ===
> RCS file: /cvs/src/sys/kern/kern_malloc.c,v
> retrieving revision 1.145
> diff -u -p -r1.145 kern_malloc.c
> --- kern/kern_malloc.c21 Apr 2021 10:02:05 -  1.145
> +++ kern/kern_malloc.c24 Apr 2021 09:53:12 -
> @@ -228,12 +228,12 @@ malloc(size_t size, int type, int flags)
>   mtx_leave(&malloc_mtx);
>   npg = atop(round_page(allocsize));
>   s = splvm();
> - va = (caddr_t)uvm_km_kmemalloc_pla(kmem_map, NULL,
> + va = (caddr_t)uvm_km_kmemalloc(kmem_map, NULL,
>   (vsize_t)ptoa(npg), 0,
>   ((flags & M_NOWAIT) ? UVM_KMF_NOWAIT : 0) |
>   ((flags & M_CANFAIL) ? UVM_KMF_CANFAIL : 0),
>   no_constraint.ucr_low, no_constraint.ucr_high,
> - 0, 0, 0);
> + 0);
>   splx(s);
>   if (va == NULL) {
>   /*
> Index: uvm/uvm_extern.h
> ===
> RCS file: /cvs/src/sys/uvm/uvm_extern.h,v
> retrieving revision 1.157
> diff -u -p -r1.157 uvm_extern.h
> --- uvm/uvm_extern.h  12 Mar 2021 14:15:49 -  1.157
> +++ uvm/uvm_extern.h  24 Apr 2021 09:53:12 -
> @@ -297,11 +297,9 @@ int  uvm_io(vm_map_t, struct uio *, 
> int
>  vaddr_t  uvm_km_alloc1(vm_map_t, vsize_t, vsize_t, 
> boolean_t);
>  void uvm_km_free(vm_map_t, vaddr_t, vsize_t);
>  void uvm_km_free_wakeup(vm_map_t, vaddr_t, vsize_t);
> -vaddr_t  uvm_km_kmemalloc_pla(struct vm_map *,
> +vaddr_t  uvm_km_kmemalloc(struct vm_map *,
>   struct uvm_object *, vsize_t, vsize_t, int,
> - paddr_t, paddr_t, paddr_t, paddr_t, int);
> -#define uvm_km_kmemalloc(map, obj, sz, flags)
> \
> - uvm_km_kmemalloc_pla(map, 

Re: Stop/unstop process & xsig

2021-04-24 Thread Martin Pieuchot
On 20/03/21(Sat) 13:25, Martin Pieuchot wrote:
> Diff below refactors routines to stop/unstop processes and save the signal
> number which will/can be transmitted in wait4(2).  It does the following:
> 
> - Move the "hack" involving P_SINTR to avoid grabbing the SCHED_LOCK()
>   recursively inside proc_stop().
> 
> - Introduce proc_unstop(), the symmetric routine to proc_stop().
> 
> - Manipulate `ps_xsig' only in proc_stop/unstop().
> 
> Ok?

Anyone?

> Index: kern/kern_sig.c
> ===
> RCS file: /cvs/src/sys/kern/kern_sig.c,v
> retrieving revision 1.278
> diff -u -p -r1.278 kern_sig.c
> --- kern/kern_sig.c   12 Mar 2021 10:13:28 -  1.278
> +++ kern/kern_sig.c   20 Mar 2021 12:16:51 -
> @@ -124,7 +124,7 @@ const int sigprop[NSIG + 1] = {
>  
>  void setsigvec(struct proc *, int, struct sigaction *);
>  
> -void proc_stop(struct proc *p, int);
> +int proc_stop(struct proc *p, int, int);
>  void proc_stop_sweep(void *);
>  void *proc_stop_si;
>  
> @@ -1061,8 +1061,7 @@ ptsignal(struct proc *p, int signum, enu
>   if (pr->ps_flags & PS_PPWAIT)
>   goto out;
>   atomic_clearbits_int(siglist, mask);
> - pr->ps_xsig = signum;
> - proc_stop(p, 0);
> + proc_stop(p, signum, 0);
>   goto out;
>   }
>   /*
> @@ -1170,17 +1169,12 @@ out:
>   *
>   *   while (signum = cursig(curproc))
>   *   postsig(signum);
> - *
> - * Assumes that if the P_SINTR flag is set, we're holding both the
> - * kernel and scheduler locks.
>   */
>  int
>  cursig(struct proc *p)
>  {
>   struct process *pr = p->p_p;
>   int sigpending, signum, mask, prop;
> - int dolock = (p->p_flag & P_SINTR) == 0;
> - int s;
>  
>   KERNEL_ASSERT_LOCKED();
>  
> @@ -1217,31 +1211,22 @@ cursig(struct proc *p)
>*/
>   if (((pr->ps_flags & (PS_TRACED | PS_PPWAIT)) == PS_TRACED) &&
>   signum != SIGKILL) {
> - pr->ps_xsig = signum;
>  
>   single_thread_set(p, SINGLE_SUSPEND, 0);
> -
> - if (dolock)
> - SCHED_LOCK(s);
> - proc_stop(p, 1);
> - if (dolock)
> - SCHED_UNLOCK(s);
> -
> + signum = proc_stop(p, signum, 1);
>   single_thread_clear(p, 0);
>  
>   /*
>* If we are no longer being traced, or the parent
>* didn't give us a signal, look for more signals.
>*/
> - if ((pr->ps_flags & PS_TRACED) == 0 ||
> - pr->ps_xsig == 0)
> + if ((pr->ps_flags & PS_TRACED) == 0 || signum == 0)
>   continue;
>  
>   /*
>* If the new signal is being masked, look for other
>* signals.
>*/
> - signum = pr->ps_xsig;
>   mask = sigmask(signum);
>   if ((p->p_sigmask & mask) != 0)
>   continue;
> @@ -1286,12 +1271,7 @@ cursig(struct proc *p)
>   (pr->ps_pgrp->pg_jobc == 0 &&
>   prop & SA_TTYSTOP))
>   break;  /* == ignore */
> - pr->ps_xsig = signum;
> - if (dolock)
> - SCHED_LOCK(s);
> - proc_stop(p, 1);
> - if (dolock)
> - SCHED_UNLOCK(s);
> + proc_stop(p, signum, 1);
>   break;
>   } else if (prop & SA_IGNORE) {
>   /*
> @@ -1331,15 +1311,21 @@ keep:
>   * Put the argument process into the stopped state and notify the parent
>   * via wakeup.  Signals are handled elsewhere.  The process must not be
>   * on the run queue.
> + *
> + * Assumes that if the P_SINTR flag is set, we're holding the scheduler
> + * lock.
>   */
> -void
> -proc_stop(struct proc *p, int sw)
> +int
> +proc_stop(struct proc *p, int signum, int sw)
>  {
>   struct process *pr = p->p_p;
> + int dolock = (p->p_flag & P_SINTR) == 0;
> + int s;
>  
> -#ifdef MULTIPROCESSOR
> - SCHED_ASSERT_LOCKED();
> -#endif
> + pr->ps_xsig = signum;
> +
> + if (dolock)
> + SCHED_LOCK(s);
>  
>   p->p_stat = SSTOP;
>   atomic_clearbits_int(&pr->ps_flags, PS_WAITED);
> @@ -1352,6 +1338,13 @@ proc_stop(struct proc *p, int sw)
>   softintr_schedule(proc_stop_si);
>   if (sw)
>   mi_switch();
> +
> + if (dolock)
> +  

uvm_km_kmemalloc{,_pla}()

2021-04-24 Thread Martin Pieuchot
Diff below merges the two allocators into one and removes unused
alignment/boundary arguments.  This is a small cleanup that helps
me keep track of the remaining allocators.

ok?
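
For reference, the merged allocator's signature after this diff, with
parameter names added by me based on the header change below:

vaddr_t	uvm_km_kmemalloc(struct vm_map *map, struct uvm_object *obj,
	    vsize_t size, vsize_t valign, int flags,
	    paddr_t low, paddr_t high, int nsegs);

Callers that used the old wrapper without physical constraints now pass
the no_constraint range explicitly, as in the pmap7.c hunk below.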

Index: arch/arm/arm/pmap7.c
===
RCS file: /cvs/src/sys/arch/arm/arm/pmap7.c,v
retrieving revision 1.61
diff -u -p -r1.61 pmap7.c
--- arch/arm/arm/pmap7.c25 Mar 2021 04:12:00 -  1.61
+++ arch/arm/arm/pmap7.c24 Apr 2021 09:53:11 -
@@ -2435,8 +2435,9 @@ pmap_bootstrap_pv_page_alloc(struct pool
return (rv);
}
 
-   new_page = uvm_km_kmemalloc(kernel_map, NULL, PAGE_SIZE,
-   (flags & PR_WAITOK) ? 0 : UVM_KMF_NOWAIT);
+   new_page = uvm_km_kmemalloc(kernel_map, NULL, PAGE_SIZE, 0,
+   (flags & PR_WAITOK) ? 0 : UVM_KMF_NOWAIT, no_constraint.ucr_low,
+   no_constraint.ucr_high, 0);
 
last_bootstrap_page = new_page;
return ((void *)new_page);
Index: dev/pci/envy.c
===
RCS file: /cvs/src/sys/dev/pci/envy.c,v
retrieving revision 1.81
diff -u -p -r1.81 envy.c
--- dev/pci/envy.c  5 Jan 2020 01:07:58 -   1.81
+++ dev/pci/envy.c  24 Apr 2021 09:53:12 -
@@ -1834,9 +1834,9 @@ envy_allocm(void *self, int dir, size_t 
 #define ENVY_ALIGN 4
 #define ENVY_MAXADDR   ((1 << 28) - 1)
 
-   buf->addr = (caddr_t)uvm_km_kmemalloc_pla(kernel_map,
+   buf->addr = (caddr_t)uvm_km_kmemalloc(kernel_map,
uvm.kernel_object, buf->size, 0, UVM_KMF_NOWAIT, 0,
-   (paddr_t)ENVY_MAXADDR, 0, 0, 1);
+   (paddr_t)ENVY_MAXADDR, 1);
if (buf->addr == NULL) {
DPRINTF("%s: unable to alloc dma segment\n", DEVNAME(sc));
goto err_ret;
Index: dev/pci/if_bce.c
===
RCS file: /cvs/src/sys/dev/pci/if_bce.c,v
retrieving revision 1.53
diff -u -p -r1.53 if_bce.c
--- dev/pci/if_bce.c10 Jul 2020 13:22:20 -  1.53
+++ dev/pci/if_bce.c24 Apr 2021 09:53:12 -
@@ -253,9 +253,9 @@ bce_attach(struct device *parent, struct
bce_reset(sc);
 
/* Create the data DMA region and maps. */
-   if ((sc->bce_data = (caddr_t)uvm_km_kmemalloc_pla(kernel_map,
+   if ((sc->bce_data = (caddr_t)uvm_km_kmemalloc(kernel_map,
uvm.kernel_object, (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES, 0,
-   UVM_KMF_NOWAIT, 0, (paddr_t)(0x4000 - 1), 0, 0, 1)) == NULL) {
+   UVM_KMF_NOWAIT, 0, (paddr_t)(0x4000 - 1), 1)) == NULL) {
printf(": unable to alloc space for ring");
return;
}
Index: kern/kern_malloc.c
===
RCS file: /cvs/src/sys/kern/kern_malloc.c,v
retrieving revision 1.145
diff -u -p -r1.145 kern_malloc.c
--- kern/kern_malloc.c  21 Apr 2021 10:02:05 -  1.145
+++ kern/kern_malloc.c  24 Apr 2021 09:53:12 -
@@ -228,12 +228,12 @@ malloc(size_t size, int type, int flags)
mtx_leave(&malloc_mtx);
npg = atop(round_page(allocsize));
s = splvm();
-   va = (caddr_t)uvm_km_kmemalloc_pla(kmem_map, NULL,
+   va = (caddr_t)uvm_km_kmemalloc(kmem_map, NULL,
(vsize_t)ptoa(npg), 0,
((flags & M_NOWAIT) ? UVM_KMF_NOWAIT : 0) |
((flags & M_CANFAIL) ? UVM_KMF_CANFAIL : 0),
no_constraint.ucr_low, no_constraint.ucr_high,
-   0, 0, 0);
+   0);
splx(s);
if (va == NULL) {
/*
Index: uvm/uvm_extern.h
===
RCS file: /cvs/src/sys/uvm/uvm_extern.h,v
retrieving revision 1.157
diff -u -p -r1.157 uvm_extern.h
--- uvm/uvm_extern.h12 Mar 2021 14:15:49 -  1.157
+++ uvm/uvm_extern.h24 Apr 2021 09:53:12 -
@@ -297,11 +297,9 @@ intuvm_io(vm_map_t, struct uio *, 
int
 vaddr_tuvm_km_alloc1(vm_map_t, vsize_t, vsize_t, 
boolean_t);
 void   uvm_km_free(vm_map_t, vaddr_t, vsize_t);
 void   uvm_km_free_wakeup(vm_map_t, vaddr_t, vsize_t);
-vaddr_tuvm_km_kmemalloc_pla(struct vm_map *,
+vaddr_tuvm_km_kmemalloc(struct vm_map *,
struct uvm_object *, vsize_t, vsize_t, int,
-   paddr_t, paddr_t, paddr_t, paddr_t, int);
-#define uvm_km_kmemalloc(map, obj, sz, flags)  \
-   uvm_km_kmemalloc_pla(map, obj, sz, 0, flags, 0, (paddr_t)-1, 0, 0, 0)
+   paddr_t, paddr_t, int);
 vaddr_tuvm_km_valloc(vm_map_t, vsize_t);
 vaddr_tuvm_km_valloc_try(vm_map_t, vsize_t);
 vaddr_tuvm_km_valloc_wait(vm_map_t, 

arp mbuf queue

2021-04-24 Thread Alexander Bluhm
Hi,

Hrvoje Popovski has successfully tested this diff together with my
parallel network forwarding.  An mbuf list is not MP safe, and enqueueing
modifies the next pointer within the mbuf.

Convert the ARP mbuf list to an mbuf queue that contains a mutex.
Update la_hold_total with atomic operations.

Note that arp_list still lacks proper locking, so this diff is only a start.

ok?

bluhm
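
The semantic difference, condensed (a sketch; the struct layout is from
memory, see sys/mbuf.h for the real thing):

/* mbuf_list: no lock of its own, the caller must serialize */
ml_enqueue(&la->la_ml, m);

/* mbuf_queue: a list plus its own mutex and length limit */
struct mbuf_queue {
	struct mutex		mq_mtx;
	struct mbuf_list	mq_list;
	u_int			mq_maxlen;
	u_int			mq_drops;
};

mq_init(&la->la_mq, LA_HOLD_QUEUE, IPL_SOFTNET);
mq_enqueue(&la->la_mq, m);	/* takes mq_mtx internally */

mq_push(), used below, enqueues like mq_enqueue(), but when the queue is
full it drops the oldest mbuf instead of the new one and returns nonzero;
that is why arpresolve() decrements la_hold_total in that case.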

Index: netinet/if_ether.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/if_ether.c,v
retrieving revision 1.245
diff -u -p -r1.245 if_ether.c
--- netinet/if_ether.c  23 Apr 2021 21:55:36 -  1.245
+++ netinet/if_ether.c  23 Apr 2021 22:43:15 -
@@ -68,7 +68,7 @@
 struct llinfo_arp {
LIST_ENTRY(llinfo_arp)   la_list;
struct rtentry  *la_rt; /* backpointer to rtentry */
-   struct mbuf_list la_ml; /* packet hold queue */
+   struct mbuf_queuela_mq; /* packet hold queue */
time_t   la_refreshed;  /* when was refresh sent */
int  la_asked;  /* number of queries sent */
 };
@@ -189,7 +189,7 @@ arp_rtrequest(struct ifnet *ifp, int req
break;
}
 
-   ml_init(&la->la_ml);
+   mq_init(&la->la_mq, LA_HOLD_QUEUE, IPL_SOFTNET);
la->la_rt = rt;
rt->rt_flags |= RTF_LLINFO;
if ((rt->rt_flags & RTF_LOCAL) == 0)
@@ -203,7 +203,7 @@ arp_rtrequest(struct ifnet *ifp, int req
LIST_REMOVE(la, la_list);
rt->rt_llinfo = NULL;
rt->rt_flags &= ~RTF_LLINFO;
-   la_hold_total -= ml_purge(&la->la_ml);
+   atomic_sub_int(&la_hold_total, mq_purge(&la->la_mq));
pool_put(&arp_pool, la);
break;
 
@@ -374,18 +374,11 @@ arpresolve(struct ifnet *ifp, struct rte
 * response yet. Insert mbuf in hold queue if below limit
 * if above the limit free the queue without queuing the new packet.
 */
-   if (la_hold_total < LA_HOLD_TOTAL) {
-   struct mbuf *mh;
-
-   if (ml_len(&la->la_ml) >= LA_HOLD_QUEUE) {
-   mh = ml_dequeue(&la->la_ml);
-   la_hold_total--;
-   m_freem(mh);
-   }
-   ml_enqueue(&la->la_ml, m);
-   la_hold_total++;
+   if (atomic_inc_int_nv(&la_hold_total) <= LA_HOLD_TOTAL) {
+   if (mq_push(&la->la_mq, m) != 0)
+   atomic_dec_int(&la_hold_total);
} else {
-   la_hold_total -= ml_purge(&la->la_ml);
+   atomic_sub_int(&la_hold_total, mq_purge(&la->la_mq) + 1);
m_freem(m);
}
 
@@ -414,7 +407,8 @@ arpresolve(struct ifnet *ifp, struct rte
rt->rt_expire += arpt_down;
la->la_asked = 0;
la->la_refreshed = 0;
-   la_hold_total -= ml_purge(&la->la_ml);
+   atomic_sub_int(&la_hold_total,
+   mq_purge(&la->la_mq));
}
}
}
@@ -600,7 +594,7 @@ arpcache(struct ifnet *ifp, struct ether
struct in_addr *spa = (struct in_addr *)ea->arp_spa;
char addr[INET_ADDRSTRLEN];
struct ifnet *rifp;
-   unsigned int len;
+   struct mbuf *m;
int changed = 0;
 
KERNEL_ASSERT_LOCKED();
@@ -672,20 +666,18 @@ arpcache(struct ifnet *ifp, struct ether
 
la->la_asked = 0;
la->la_refreshed = 0;
-   while ((len = ml_len(&la->la_ml)) != 0) {
-   struct mbuf *mh;
+   while ((m = mq_dequeue(&la->la_mq)) != NULL) {
+   unsigned int len;
 
-   mh = ml_dequeue(&la->la_ml);
-   la_hold_total--;
+   atomic_dec_int(&la_hold_total);
+   len = mq_len(&la->la_mq);
 
-   ifp->if_output(ifp, mh, rt_key(rt), rt);
+   ifp->if_output(ifp, m, rt_key(rt), rt);
 
-   if (ml_len(&la->la_ml) == len) {
+   /* XXXSMP we discard if other CPU enqueues */
+   if (mq_len(&la->la_mq) > len) {
/* mbuf is back in queue. Discard. */
-   while ((mh = ml_dequeue(&la->la_ml)) != NULL) {
-   la_hold_total--;
-   m_freem(mh);
-   }
+   atomic_sub_int(&la_hold_total, mq_purge(&la->la_mq));
break;
}
}
@@ -699,7 +691,7 @@ arpinvalidate(struct rtentry *rt)
struct llinfo_arp *la = (struct llinfo_arp *)rt->rt_llinfo;
struct sockaddr_dl *sdl = satosdl(rt->rt_gateway);
 
-   la_hold_total -= ml_purge(&la->la_ml);
+   atomic_sub_int(&la_hold_total, mq_purge(&la->la_mq));
sdl->sdl_alen = 0;
la->la_asked = 0;
 }



Re: re-enable A-MSDU support with iwm(4) and iwx(4) fixed

2021-04-24 Thread Matthias Schmidt
Hi,

* Stefan Sperling wrote:
> On Fri, Apr 23, 2021 at 12:28:42PM +0200, Matthias Schmidt wrote:
> > I had a new kernel with only your following patch running all day and
> > never encountered the situation as described in my last email.
> > Connection was stable and transfer rates remained around 3MB/s.  This is
> > less then the rates mentioned in your recent email but I rarely get
> > higher speeds.  So I assume it is to my environment and has nothing to
> > do with the patch.
> 
> Ok. Thanks for spending the time to check this thoroughly. I have already
> made myself spend many hours debugging ghosts after test observations like
> this, many many times :-) So I've become careful about quickly drawing
> conclusions from observations like this. Wifi is complicated and as far
> as I know nobody is ever testing OpenBSD wifi patches in an RF-isolated lab.
> 
> Many problem reports I receive (and many of those arrive in private
> email, unfortunately) boil down to "it doesn't work when the cafe downstairs
> fills up with laptops and phones at lunch time" or "it only becomes slow
> when two or more laptops are doing video calls at the same time".
> The problem is that sometimes there are bugs to fix or missing features to
> implement in situations like this, so I cannot just dismiss such reports
> outright. But dealing with them can take a huge amount of time.
> To be honest, I wouldn't consider your problem a blocking issue. I would
> first wait to see if the exact same issue pops up elsewhere after the
> patch lands in the tree.

It seems I hit one of those byzantine situations.  I had the patch running
on the T450s with the 8265, and additionally on an X250 with the following
hardware, all evening and morning (including suspend/resume cycles), and
it worked as expected.

iwm0 at pci2 dev 0 function 0 "Intel AC 7265" rev 0x59, msi
iwm0: hw rev 0x210, fw ver 17.3216344376.0, address

Cheers and thanks for the good work and your efforts

Matthias