The bitrig patrick_smpns_arm branch has been created by patrick.

It is 0 commits behind master and 37 commits ahead.

commit 21648d6298bf3ea7c96942082b0a69e5ed408b13
diff: https://github.com/bitrig/bitrig/commit/21648d6
author: Patrick Wildt <[email protected]>
date: Wed Dec 10 09:41:51 2014 +0100

Commit what I have....

D       sys/arch/arm/include/softintr.h
M       sys/arch/armv7/exynos/exuart.c
M       sys/arch/armv7/imx/imxuart.c
M       sys/arch/armv7/include/intr.h
M       sys/arch/armv7/omap/intc.c
M       sys/arch/armv7/omap/intc.h
M       sys/arch/armv7/omap/omgpio.c
M       sys/arch/armv7/sunxi/a1xintc.c
M       sys/arch/armv7/sunxi/a1xintc.h
M       sys/arch/armv7/sunxi/sxipio.c
M       sys/arch/armv7/sunxi/sxiuart.c
M       sys/arch/armv7/virt/pl011.c
M       sys/kern/kern_ithread.c

commit 6c3b4fa351bd1b3750a69bcfc62d603ea5bb60d1
diff: https://github.com/bitrig/bitrig/commit/6c3b4fa
author: Patrick Wildt <[email protected]>
date: Tue Dec 9 08:47:36 2014 +0100

More.

M       sys/arch/arm/arm/vfp.c
M       sys/arch/arm/include/cpufunc.h
M       sys/arch/armv7/armv7/intr.c
M       sys/arch/armv7/exynos/exuart.c
M       sys/arch/armv7/include/intr.h
M       sys/kern/kern_bmtx.c
M       sys/kern/kern_ithread.c
M       sys/kern/kern_malloc.c
M       sys/kern/kern_sig.c
M       sys/kern/subr_pool.c

commit a2b1fb8f2668d245756cdc194f8ddafce43a68ef
diff: https://github.com/bitrig/bitrig/commit/a2b1fb8
author: Patrick Wildt <[email protected]>
date: Fri Oct 4 19:51:03 2013 +0200

Some ARM stuff.

M       sys/arch/arm/arm/cpu.c
M       sys/arch/arm/arm/cpuswitch.S
M       sys/arch/arm/arm/pmap.c
M       sys/arch/arm/cortex/agtimer.c
M       sys/arch/arm/cortex/ampintc.c
M       sys/arch/arm/cortex/amptimer.c
M       sys/arch/arm/include/cpu.h
M       sys/arch/armv7/armv7/armv7_machdep.c
M       sys/arch/armv7/armv7/autoconf.c
M       sys/arch/armv7/armv7/intr.c
M       sys/arch/armv7/conf/files.armv7
M       sys/arch/armv7/exynos/exgpio.c
M       sys/arch/armv7/exynos/exuart.c
M       sys/arch/armv7/imx/imxgpio.c
M       sys/arch/armv7/imx/imxuart.c
A       sys/arch/armv7/include/cpufunc.h
M       sys/arch/armv7/include/intr.h
A       sys/arch/armv7/include/intrdefs.h
M       sys/kern/kern_ithread.c

commit c086612bb833d7cada31d21e1cfcf7d5aa94435e
diff: https://github.com/bitrig/bitrig/commit/c086612
author: Christiano Haesbaert <[email protected]>
date: Wed Dec 10 02:14:52 2014 +0100

Give device names to ithreads p_comm and evcounters to softintr.

Concatenate all device names in the ithread's process name; this is
useful for debugging and/or profiling.

Also put event counters on software interrupts; I can't recall how
many times I wanted to know this.

A ps auxkl might show the following now:

root      6034  0.0  0.0      0      0 ??  DK     2:01AM    0:04.97 (softclock)           0     0   0 -18   0 intr
root     24717  0.0  0.0      0      0 ??  DK     2:01AM    0:01.06 (softnet)             0     0   0 -18   0 intr
root     25481  0.0  0.0      0      0 ??  DK     2:01AM    0:01.08 (ehci0,mpi0)          0     0   0 -18   0 intr
root     23688  0.0  0.0      0      0 ??  DK     2:01AM    0:01.02 (em0)                 0     0   0 -18   0 intr
....

M       sys/dev/ic/com.c
M       sys/dev/usb/usb.c
M       sys/kern/kern_clock.c
M       sys/kern/kern_ithread.c
M       sys/kern/kern_softintr.c
M       sys/net/netisr.c
M       sys/net/pipex.c
M       sys/sys/softintr.h

commit 8b9c85e4ec0b2bcabb269d1e775ef581b7c32107
diff: https://github.com/bitrig/bitrig/commit/8b9c85e
author: Christiano Haesbaert <[email protected]>
date: Mon Dec 8 23:36:55 2014 +0100

Cleanup softintr, remove establish_mpsafe now that IPL_FLAGS is here.
Also kill unused softintr.c from amd64.

ok pwildt@

D       sys/arch/amd64/amd64/softintr.c
M       sys/kern/kern_softintr.c
M       sys/sys/softintr.h

commit d992746c161f3bb065557ae4b6db5e29ab9a8b59
diff: https://github.com/bitrig/bitrig/commit/d992746
author: Patrick Wildt <[email protected]>
date: Sat Aug 23 22:52:57 2014 +0200

Revert to nearly the old softintr_ API, but keep ithreads for it.

ok haesbaert@

M       sys/arch/amd64/include/intr.h
M       sys/conf/files
M       sys/dev/ic/com.c
M       sys/dev/usb/usb.c
M       sys/kern/kern_clock.c
M       sys/kern/kern_ithread.c
M       sys/kern/kern_sig.c
A       sys/kern/kern_softintr.c
M       sys/net/netisr.c
M       sys/net/netisr.h
M       sys/net/pipex.c
M       sys/sys/ithread.h
A       sys/sys/softintr.h

commit 65a1d43d676aaa37d81e4fae9961293b0eeb96b0
diff: https://github.com/bitrig/bitrig/commit/65a1d43
author: M Farkas-Dyck <[email protected]>
date: Mon Dec 8 09:47:29 2014 -0500

unbreak bmtx_fetch_and

M       sys/kern/kern_bmtx.c

commit 58367e97e075feba5d83937e265dbeea7cebaf84
diff: https://github.com/bitrig/bitrig/commit/58367e9
author: Christiano Haesbaert <[email protected]>
date: Fri Dec 5 12:38:03 2014 +0100

Hook kernel preemption back and allow "total" preemption.

kern.preemption = 2 will make any higher-priority thread preempt
the current thread, if and only if they are on the same cpu.

Highly experimental; performance drops (since contention on the kernel
lock causes even more context switches).

kern.preemption = 1 still allows preemption just for interrupt
threads.

M       sys/kern/kern_sysctl.c
M       sys/kern/sched_bsd.c

commit 8b6a190ba6920729756088cbe2c651585338891b
diff: https://github.com/bitrig/bitrig/commit/8b6a190
author: Christiano Haesbaert <[email protected]>
date: Thu Dec 4 18:56:17 2014 +0100

Use acq_rel in atomic_fetch_{or,and} for kern_bmtx.

Also make wrappers around the atomic_c11 calls; the code reads better.
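
As a loose illustration only (the helper names and the atomic word type
are assumptions, not the actual kern_bmtx.c interface), such wrappers
around the C11 calls might look like this:

#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical acq_rel wrappers around the C11 atomics. */
static inline uintptr_t
bmtx_fetch_or(_Atomic uintptr_t *word, uintptr_t bits)
{
        return (atomic_fetch_or_explicit(word, bits, memory_order_acq_rel));
}

static inline uintptr_t
bmtx_fetch_and(_Atomic uintptr_t *word, uintptr_t bits)
{
        return (atomic_fetch_and_explicit(word, bits, memory_order_acq_rel));
}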

M       sys/kern/kern_bmtx.c

commit 01b18d2238c9f498a455deb56b82004511fdd37a
diff: https://github.com/bitrig/bitrig/commit/01b18d2
author: Christiano Haesbaert <[email protected]>
date: Thu Dec 4 17:03:58 2014 +0100

Use uintptr_t instead of u_long in kern_bmtx.c.

M       sys/kern/kern_bmtx.c

commit d01f67541d7d3e31d8738f75670f543a25c88dbd
diff: https://github.com/bitrig/bitrig/commit/d01f675
author: Christiano Haesbaert <[email protected]>
date: Tue Dec 2 19:21:36 2014 +0100

Simplify ithread_sleep() and ithread_run().

setrunnable() is less efficient since it does unnecessary computations
for a kthread. Even so, let's use it as it reduces the amount of
replicated code.

M       sys/kern/kern_ithread.c
M       sys/kern/sched_bsd.c
M       sys/sys/proc.h

commit 832ceb3632ad69b52ae0f67a04b5d79bf251ffa3
diff: https://github.com/bitrig/bitrig/commit/832ceb3
author: Christiano Haesbaert <[email protected]>
date: Thu Oct 30 21:04:14 2014 +0100

crit_rundeferred() -> crit_unpend(), call MD intr_unpend().

Make crit_unpend() the function to be called every time we leave a
critical section, and make it call intr_unpend(), which is MD.

Trying to keep the code in intr_unpend() MI is unrealistic for now.
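
A minimal sketch of that shape, with assumed names rather than the real
kern_crit.c code:

/* Leaving the outermost critical section hands deferred work to the MD layer. */
void
crit_leave(void)
{
        struct proc *p = curproc;

        if (--p->p_crit_depth == 0)     /* field name is an assumption */
                crit_unpend();
}

void
crit_unpend(void)
{
        intr_unpend();                  /* MD hook: run deferred interrupts */
}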

M       sys/arch/amd64/amd64/intr.c
M       sys/arch/amd64/amd64/spl.S
M       sys/arch/amd64/include/intr.h
M       sys/kern/kern_crit.c
M       sys/sys/proc.h

commit 698611370a45c152167be8d509a1f03c7bfa9cfa
diff: https://github.com/bitrig/bitrig/commit/6986113
author: Christiano Haesbaert <[email protected]>
date: Mon Sep 29 22:46:05 2014 +0200

Slightly more efficient crit_rundeferred().

Instead of disabling/enabling interrupts for every ipending
intrsource, do it once and clear it.

Another option is using atomic_exchange_explicit(), but the bus locking
is unnecessary since this is a cpu-local word.
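
Roughly, with assumed per-cpu field names, the idea is to block
interrupts once, snapshot and clear the cpu-local pending word, then walk
the bits with interrupts enabled again:

void
crit_rundeferred(void)
{
        struct cpu_info *ci = curcpu();
        u_int32_t pending;
        int slot;

        intr_disable();
        pending = ci->ci_ipending;      /* cpu-local word, no atomics needed */
        ci->ci_ipending = 0;
        intr_enable();

        for (slot = 0; pending != 0; slot++, pending >>= 1) {
                if (pending & 1) {
                        /* run the deferred source registered in this slot */
                }
        }
}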

M       sys/kern/kern_crit.c

commit 6291497563a345c35f728e9a7c963752eb6d2338
diff: https://github.com/bitrig/bitrig/commit/6291497
author: Christiano Haesbaert <[email protected]>
date: Sun Sep 28 12:23:23 2014 +0200

Be a bit more strict with KERNEL_LOCK ordering DIAGNOSTIC.

Also make the loss of atomicity in pool_get() drop and reacquire the
critical sections. This makes the kernel lock crit depth messages go
away on boot.

M       sys/kern/kern_malloc.c
M       sys/kern/subr_pool.c
M       sys/sys/systm.h

commit 45ebbaeb109a36b8b966b7f7332748e4ff710fe6
diff: https://github.com/bitrig/bitrig/commit/45ebbae
author: Christiano Haesbaert <[email protected]>
date: Sat Sep 27 13:01:27 2014 +0200

Kill CRITCOUNTERS leftovers

M       sys/kern/kern_crit.c

commit 872c57bc40056ae8fe1a3188ec0525a115f04dfa
diff: https://github.com/bitrig/bitrig/commit/872c57b
author: Christiano Haesbaert <[email protected]>
date: Sat Sep 27 12:52:48 2014 +0200

Use intr_ as the prefix for the interrupt API: intr_[disable|enable|...]

Discussed with Patrick; we both agree this makes more sense than using
a suffix.

Also use intr_get_state() and intr_set_state() instead of state() and
restore().
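
For illustration, a typical save/disable/restore sequence with the
renamed calls (signatures assumed) would read:

void
example_md_section(void)
{
        intr_state_t s;

        s = intr_get_state();           /* remember how we came in */
        intr_disable();
        /* touch per-cpu or hardware state that must not be interrupted */
        intr_set_state(s);              /* put back whatever state we had */
}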

M       sys/arch/amd64/amd64/amd64_mem.c
M       sys/arch/amd64/amd64/hibernate_machdep.c
M       sys/arch/amd64/amd64/i8259.c
M       sys/arch/amd64/amd64/ioapic.c
M       sys/arch/amd64/amd64/ipifuncs.c
M       sys/arch/amd64/amd64/lapic.c
M       sys/arch/amd64/amd64/lock_machdep.c
M       sys/arch/amd64/amd64/machdep.c
M       sys/arch/amd64/amd64/mp_setperf.c
M       sys/arch/amd64/include/cpufunc.h
M       sys/arch/amd64/isa/clock.c
M       sys/dev/acpi/acpi.c
M       sys/dev/isa/gus.c
M       sys/kern/kern_crit.c
M       sys/kern/kern_sched.c

commit 0b23d19d8347c502a9b448833273ca1bee3cf481
diff: https://github.com/bitrig/bitrig/commit/0b23d19
author: Christiano Haesbaert <[email protected]>
date: Tue Sep 16 20:21:46 2014 +0200

Add ithread.h and bmtx.h to distrib sets

M       distrib/sets/lists/base/md.amd64
M       distrib/sets/lists/comp/mi

commit 1ab4f2b6b7828387e3192494a3428f5d48299d44
diff: https://github.com/bitrig/bitrig/commit/1ab4f2b
author: Christiano Haesbaert <[email protected]>
date: Fri Aug 29 13:38:18 2014 +0200

Use more sensical labels on vector.S

M       sys/arch/amd64/amd64/vector.S

commit b29ca5e46e314644d8b87c5d011a955fe9a25da8
diff: https://github.com/bitrig/bitrig/commit/b29ca5e
author: Christiano Haesbaert <[email protected]>
date: Wed Aug 27 20:18:12 2014 +0200

Give names to bmtx locks and introduce a bmtx_dump().

The name of the lock is saved on p_wmesg in case the process is
waiting for the lock.

M       sys/kern/kern_bmtx.c
M       sys/kern/kern_lock.c
M       sys/sys/bmtx.h

commit 0d25c2f8812900efc1e306f4c68c316efcb8aef2
diff: https://github.com/bitrig/bitrig/commit/0d25c2f
author: Christiano Haesbaert <[email protected]>
date: Wed Aug 27 18:21:02 2014 +0200

Decouple the idea of masking/unmasking an intrsource from pic{}.

When we take an interrupt, we mask the source and schedule the thread;
when the thread eventually runs and finishes processing, it unmasks
the source. So there must be a way of unmasking the source in an MI
fashion: each architecture should have an intrsource_unmask() function.
On amd64, we just map it to the corresponding pic{} callback.
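
On amd64 that mapping could be as small as the following sketch; the
field and callback names follow the existing pic{}/intrsource{} layout
but should be treated as assumptions:

static inline void
intrsource_unmask(struct intrsource *is)
{
        struct pic *pic = is->is_pic;

        /* forward to the pic callback that owns this source */
        (*pic->pic_hwunmask)(pic, is->is_pin);
}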

M       sys/arch/amd64/include/intr.h
M       sys/kern/kern_ithread.c

commit f13d4e4dab2e15d33cc145aafc6a31ceb47764d7
diff: https://github.com/bitrig/bitrig/commit/f13d4e4
author: Christiano Haesbaert <[email protected]>
date: Wed Aug 27 18:06:11 2014 +0200

Fix interrupt accounting with intr_shared_edge != 0.

M       sys/kern/kern_ithread.c

commit d9dea79133706562ad645364e19eba85b2d994ae
diff: https://github.com/bitrig/bitrig/commit/d9dea79
author: Christiano Haesbaert <[email protected]>
date: Wed Aug 27 15:49:12 2014 +0200

Whitespace & wording

M       sys/kern/kern_ithread.c

commit 3c6626dc76e8f4fbe278b644b4b2373d7ac095f0
diff: https://github.com/bitrig/bitrig/commit/3c6626d
author: Christiano Haesbaert <[email protected]>
date: Wed Aug 27 14:50:15 2014 +0200

Introduce intr_state_t and MI "API" for blocking interrupts.

The fact that disabling/enabling real interrupts is totally MD makes
writing portable code harder, so here I propose the following API:

enable_intr(void)               Enables "all" hw interrupts.
disable_intr(void)              Disables "all" hw interrupts.
intr_state_t state_intr(void)   Reads hw interrupts state.
restore_intr(intr_state_t)      Restore hw interrupts state.

I think this should map easily to arm or any other future platform;
intr_state_t should be defined to whatever is convenient for the arch.

Switch crit_rundeferred() to use it; when we port stuff to arm, we
just need to mimic the API.
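
Hypothetical MI declarations matching the API above (intr_state_t is
whatever the architecture finds convenient; on amd64 plausibly the saved
flags register):

typedef unsigned long intr_state_t;     /* per-arch choice, an assumption */

void            enable_intr(void);      /* enable "all" hw interrupts */
void            disable_intr(void);     /* disable "all" hw interrupts */
intr_state_t    state_intr(void);       /* read current hw interrupt state */
void            restore_intr(intr_state_t); /* restore a saved state */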

M       sys/arch/amd64/include/cpufunc.h
M       sys/kern/kern_crit.c

commit b7c7572202b636b794fc15bb9329d9c259ef58bc
diff: https://github.com/bitrig/bitrig/commit/b7c7572
author: Christiano Haesbaert <[email protected]>
date: Tue Aug 26 16:13:16 2014 +0200

Kill is_minlevel & is_maxlevel from intrsource{}.

This is a step toward turning intrsource{} MI and getting rid of the IPL
leftovers.

M       sys/arch/amd64/amd64/acpi_machdep.c
M       sys/arch/amd64/amd64/genassym.cf
M       sys/arch/amd64/amd64/intr.c
M       sys/arch/amd64/amd64/machdep.c
M       sys/arch/amd64/include/intr.h
M       sys/arch/amd64/include/segments.h
M       sys/kern/kern_ithread.c

commit 0dd4f3a956101b0786a99e9b570a969748f7915a
diff: https://github.com/bitrig/bitrig/commit/0dd4f3a
author: Christiano Haesbaert <[email protected]>
date: Fri Jun 27 16:27:14 2014 +0200

Turnstile lock implementation, turn kernel lock into one.

BMTX stands for "blocking mutex", and it's implemented as a turnstile
exclusive recursive lock.

The Kernel Lock has been made a bmtx.

Background
~~~~~~~~~~

I've been tracking the performance impacts of the smpns changes, and
the one thing that really hurt performance was turning the kernel lock into
a rrwlock. The number of IPIs when building the kernel or world would
go over 5k/s, much higher than the <1k/s of the original mplock
implementation. That should be no surprise, since rrwlock is an
always-sleeping lock which goes through wakeup/tsleep on every contention.

Performance also degraded considerably: baseline would build an image
in ~4m, but it took ~5m25s after the change to rrwlock.

Therefore, the motivation for a turnstile lock was purely performance,
as it holds the same semantics as a sleeping lock like rwlock.

Semantics
~~~~~~~~~

 o Bmtx uses no critical section and doesn't block interrupts in any
way; they are resilient to preemption. They are adequate to interlock
against interrupt threads, so that should become the normal lock
throughout the kernel. This is an attempt at having synchronous
locking across interrupts and kernel threads.
 o Recursion is allowed; options for forbidding it in the future should
be discussed.
 o Interleaving on unlocking is allowed; the following is legal:

bmtx_lock(a);
bmtx_lock(b);
bmtx_unlock(a);     OR     bmtx_unlock(b);
bmtx_unlock(b);            bmtx_unlock(a);

 o Sleeping with a bmtx held is legal.

Implementation
~~~~~~~~~~~~~~

 o The cost for an uncontested operation is never higher than the
maximum cost of a single compare-and-set operation; the average cost in
a cached situation is less than 50 cycles per pair of lock/unlock on my
haswell i5 2.4ghz, considering function call overhead to be 0. We can
easily cut the function call overhead by making the function into a
macro that just tries the fast operation inline (one compare-and-set
call) and, if that fails, calls into the function, roughly as sketched
below.
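
A sketch of that macro, with the names, the word layout and the C11
calls all assumed rather than taken from kern_bmtx.c:

/* Try the uncontended compare-and-set inline; fall back to the function. */
#define bmtx_lock(m) do {                                               \
        uintptr_t _expect = 0;                                          \
        if (!atomic_compare_exchange_strong_explicit(                   \
            &(m)->bmtx_lock /* _Atomic uintptr_t */, &_expect,          \
            (uintptr_t)curproc,                                         \
            memory_order_acquire, memory_order_relaxed))                \
                bmtx_lock_slow(m);                                      \
} while (0)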

o Bmtx synchronizes across one atomic word called bmtx_lock, in which we
store:

From bits 2-31 or 2-63: the address of the curproc structure.
The first two bits are abused to store two flags which are not mutually
exclusive (see the sketch below):
BMTX_RECURSE - signals that this bmtx was recursed.
BMTX_WAITERS - signals that there are sleepers waiting for the acquisition of
the lock.
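
Assumed helpers for that layout (proc structures are aligned, so the low
two bits of the owner pointer are otherwise always zero):

#define BMTX_RECURSE    ((uintptr_t)0x1)        /* lock was recursed */
#define BMTX_WAITERS    ((uintptr_t)0x2)        /* someone is queued waiting */
#define BMTX_FLAGMASK   (BMTX_RECURSE | BMTX_WAITERS)

#define BMTX_OWNER(v)   ((struct proc *)((v) & ~BMTX_FLAGMASK))
#define BMTX_FLAGS(v)   ((v) & BMTX_FLAGMASK)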

o Bmtx is a turnstile lock since it tries to spin _as long_ as the
owner of the lock is currently _running_ on another cpu. Under all
other conditions, the caller blocks. Blocking means setting the
BMTX_WAITERS in the atomic word and inserting itself in the bmtx's
waitqueue.

o To be able to block on the lock, we can't depend only on the atomic
word; we need a way to interlock with the other cpu so that we don't
miss a wakeup. This is implemented using SCHED_LOCK as an interlock,
which allows us to overcome the lost-wakeup race. The algorithm is
something like this:

1 spin on the lock until the holder is not SONPROC (running). The test
for SONPROC is done in a racy fashion: follow the process
structure and check p->p_stat.

2 If the owner appears to not be SONPROC, we interlock with
the SCHED_LOCK and _retest_ the condition, thereby recovering from
the race.

3 Assuming the owner is indeed blocked, we set BMTX_WAITERS and put
ourselves in the queue.

On the unlock path, the fast path only attempts to clear ownership if
BMTX_WAITERS is _not_ set; if it is, we need the slower path, where we
interlock with SCHED_LOCK, saving ourselves from the race. A rough
sketch of the lock slow path follows.
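
Very roughly, the lock slow path in steps 1-3 has this shape (all helper
names are assumptions, recursion and error handling omitted):

void
bmtx_lock_slow(struct bmtx *m)
{
        struct proc *owner;
        int s;

        for (;;) {
                /* 1. racy peek: spin while the owner appears to be running */
                owner = BMTX_OWNER(m->bmtx_lock);
                if (owner != NULL && owner->p_stat == SONPROC)
                        continue;

                if (owner == NULL && bmtx_try(m))
                        return;         /* lock went free, we grabbed it */

                /* 2. interlock with SCHED_LOCK and retest the condition */
                SCHED_LOCK(s);
                owner = BMTX_OWNER(m->bmtx_lock);
                if (owner == NULL || owner->p_stat == SONPROC) {
                        SCHED_UNLOCK(s);
                        continue;       /* lost the race, back to step 1 */
                }

                /* 3. owner really is blocked: flag it, queue up and sleep;
                 *    the context switch releases SCHED_LOCK for us */
                bmtx_fetch_or(&m->bmtx_lock, BMTX_WAITERS);
                /* enqueue curproc on m's wait queue, mi_switch(), retry */
        }
}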

In the future, when the scheduler doesn't suck horse-ass, we will
interlock with an arbitrary lock, something like a stripelock, instead
of a global SCHED_LOCK. We could do it with mutexes and msleep now, but
that would be ridiculous, as it's just another lock to pass ownership for
the SCHED_LOCK. A performant implementation requires the scheduler to
expose the underlying interlock to the caller.

CAVEATS
~~~~~~~

On the lock path, when testing if the owner is SONPROC, we have the
following comment:
        /* XXX if we are preempted here, high chance p might have gone and
         * we're fucked. This is an "accepted" race in freebsd because of the
         * way they recycle the td structures, it might not be to us. We could
         * think of a crit_enter()/crit_leave().

We cannot safely test p and dereference p->p_stat: the process might have
gone through sched_exit() and p might be freed. The timing would have
to be very very very evil; I've never hit this situation and I'm
buying it for now.

Alternative solutions are changing the way we allocate proc{} (like
freebsd), or keeping track of all held mutexes and updating a state on
the mutex itself saying "OWNER_IS_RUNNING".

Results
~~~~~~~

My build times are back to being as fast as baseline, sometimes
slightly faster. I could only test this in vmware so far, so I can't
give precise numbers. I would go as far as saying that it is _no_
worse than the baseline kernel.

Building time improved around 25% against the rrwlock implementation.

IPIs were reduced from 5000-6000 to 200-1200/s.

Statistics
~~~~~~~~~~

I've collected the number of fast vs slow paths by defining BMTX_STATS
on the lock; as shown below, the number of slow paths is
insignificant compared to the fast paths:

bmtx_fast_locks              21017802
bmtx_fast_unlocks            31133706
bmtx_recursed_locks          524294
bmtx_recursed_unlocks        524279
bmtx_lock_blocks             33174
bmtx_lock_blocks_cancelled   29742
bmtx_lock_blocks_slept       3862
bmtx_spun_locks              10117383
bmtx_blocked_unlocks         3866
bmtx_unlock_has_waiter       3866

M       sys/conf/files
A       sys/kern/kern_bmtx.c
M       sys/kern/kern_lock.c
A       sys/sys/bmtx.h
M       sys/sys/mplock.h
M       sys/sys/systm.h

commit 235c1d094f3c4d4aa147996a0af979d99c40e9d7
diff: https://github.com/bitrig/bitrig/commit/235c1d0
author: Christiano Haesbaert <[email protected]>
date: Fri Jun 6 11:42:01 2014 +0200

Stop using the sleepqueue for interrupt threads.

There is no need for it; it cuts some hacks and it's fewer
instructions. Since we always wake up from ithread_run(), we always
know the proc to awaken; the sole point of the sleepqueue is to be
storage to be found via ident hashing.

M       sys/kern/kern_ithread.c

commit 31f7fb130f0b1c475e5e3acb05f2ffb013986e74
diff: https://github.com/bitrig/bitrig/commit/31f7fb1
author: Christiano Haesbaert <[email protected]>
date: Sun May 11 17:22:11 2014 +0200

No need for running interrupt handlers in critical sections anymore.

Preemption correctly relies on the priority to do preemption; having an
ithread->ithread preemption due to an ithread sleeping at a lower
priority should be ok.

M       sys/kern/kern_ithread.c

commit 57b046f0512cfca4ab3ab928573aecd1bbff0e18
diff: https://github.com/bitrig/bitrig/commit/57b046f
author: Christiano Haesbaert <[email protected]>
date: Sat Mar 1 11:08:55 2014 +0100

Initial kernel preemption.

This diff introduces a rudimentary form of kernel preemption: it
allows interrupt threads to preempt other kernel processing, like
userland doing syscalls or other normal kthreads.

Preemption is deferred if the current process is in a critical section.

When curproc preempts:
  o It must not be stolen to run on another cpu; it
is still SRUNable, but must therefore be pinned to the curcpu()
runqueue.
  o It must continue to hold all the locks it had, in our case
rwlocks, including the kernel lock.
  o In case the _preempting_ process (not the preempted one) tries to
acquire the kernel lock but the holder was preempted, the
_preempting_ process blocks so that the _preempted_ one might be resumed.

CAVEATS:

  o The way we test for preemption is still primitive. I'll only be
able to attack this properly once I introduce proper APIs into the
scheduler, but for that I want to make the scheduler modular, so that
we are able to play with different schedulers. So we need MORE APIs,
not LESS, as _some_ people believe.
  o ci_want_resched, which is responsible for preempting userland, is
still present as at this point we need it; once I'm able to hack the
scheduler there should be only one way of preemption, so it will look
a bit funky until then.

Also introduce a sysctl (kern.preemption) which enables us to activate
or deactivate kernel preemption.
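
The gating logic, as a sketch with assumed variable and helper names,
amounts to something like:

void
maybe_preempt(void)
{
        if (kern_preemption == 0)       /* the kern.preemption knob */
                return;
        if (curproc->p_crit_depth > 0)  /* deferred until crit_leave() */
                return;
        preempt(NULL);  /* switch out; we stay pinned to curcpu()'s runqueue */
}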

M       sys/kern/kern_crit.c
M       sys/kern/kern_exit.c
M       sys/kern/kern_ithread.c
M       sys/kern/kern_ktrace.c
M       sys/kern/kern_sched.c
M       sys/kern/kern_subr.c
M       sys/kern/kern_sysctl.c
M       sys/kern/sched_bsd.c
M       sys/kern/subr_xxx.c
M       sys/sys/proc.h
M       sys/sys/sched.h
M       sys/sys/syscall_mi.h
M       sys/sys/sysctl.h
M       sys/uvm/uvm_glue.c

commit bee73ee9eeb4bf26fb83bc20d3426e7bc65b5e85
diff: https://github.com/bitrig/bitrig/commit/bee73ee
author: Christiano Haesbaert <[email protected]>
date: Wed Feb 26 23:24:37 2014 +0100

Newly formed processes should start in a critical section.

They have the schedlock, and release it in proc_trampoline_mp(). If the
newly formed process is not in a critical section, a clock interrupt
might attempt to grab the SCHED LOCK, or worse, we might preempt too
early in the future.

This paves the way for preemption and allows SCHED LOCK to be a mutex;
we only need guenther's fix for the recursive tsleep.

M       sys/kern/kern_fork.c
M       sys/sys/proc.h

commit e9ce859b3056b3825c000a6a680e52967cbca8be
diff: https://github.com/bitrig/bitrig/commit/e9ce859b
author: Christiano Haesbaert <[email protected]>
date: Sun Feb 23 22:24:24 2014 +0100

Make sure we have a process context early and kill crit_escaped.

This was lost in a merge.

M       sys/arch/amd64/amd64/machdep.c
M       sys/kern/kern_crit.c

commit c0a4ee7a816ed61dd57cf9379e9734a84a2f1520
diff: https://github.com/bitrig/bitrig/commit/c0a4ee7
author: Christiano F. Haesbaert <[email protected]>
date: Sat Feb 1 16:20:50 2014 +0100

Convert kernel lock to a rrwlock.

Introduce rrw_exit_all(), rrw_enter_cnt() and rw_held()/rrw_held().

Also, make the assert-unlocked functions make sense: now we assert
that _we_ don't hold the lock, since asserting that no one has the lock is
impossible.

We can't sleep in proc_trampoline_mp() for the idle process, since it
should never be in a sleepqueue, so I had to introduce a hack to make it
not acquire the kernel lock.

The correct fix is to pass a flag to fork1() that gets propagated to
proc_trampoline_mp().

M       sys/arch/amd64/amd64/intr.c
M       sys/arch/amd64/amd64/ipifuncs.c
M       sys/kern/kern_fork.c
M       sys/kern/kern_ithread.c
M       sys/kern/kern_lock.c
M       sys/kern/kern_rwlock.c
M       sys/kern/kern_sched.c
M       sys/kern/kern_synch.c
M       sys/sys/mplock.h
M       sys/sys/rwlock.h
M       sys/sys/systm.h

commit 20ac8a4e72412cc775579aa7299ce93afadcbca4
diff: https://github.com/bitrig/bitrig/commit/20ac8a4
author: Christiano F. Haesbaert <[email protected]>
date: Tue Dec 31 02:23:56 2013 +0100

Fix compilation of !MULTIPROCESSOR.

Armani had fixed this a while ago in the old OpenBSD tree; this is
just an updated version of his diff.

spotted by aalm.

M       sys/kern/sched_bsd.c
M       sys/sys/systm.h

commit aae0871e4ee66f0d5d857b42378172433d00243f
diff: https://github.com/bitrig/bitrig/commit/aae0871
author: Christiano F. Haesbaert <[email protected]>
date: Mon Dec 30 01:32:35 2013 +0100

Remove __mp_release_all_but_one().

M       sys/arch/amd64/amd64/lock_machdep.c
M       sys/arch/amd64/include/mplock.h
M       sys/sys/mplock.h

commit cafdc540d43050dac512ae7198be0b94f4212472
diff: https://github.com/bitrig/bitrig/commit/cafdc54
author: Christiano F. Haesbaert <[email protected]>
date: Mon Dec 30 01:27:11 2013 +0100

Make it so that mi_switch() won't relock sched_lock.

The caller needs to reacquire it if desired. More complex paths like
proc_stop() were left as before, so they reacquire the lock.

Almost every user of mi_switch() just does a SCHED_UNLOCK() after
return, so this commit prevents a LOCK/UNLOCK on a contended lock.

For this to work, you can never mi_switch with a recursive sched_lock,
and this is asserted.

Thanks to grunk@ for questioning me on the diff and finding a bug
whose fix should repair suspend/resume: it was a double acquisition of
SCHED_LOCK in sched_idle(). natano@ reported the bug.

M       sys/kern/kern_ithread.c
M       sys/kern/kern_sched.c
M       sys/kern/kern_sig.c
M       sys/kern/kern_synch.c
M       sys/kern/sched_bsd.c

commit 9b11fe35aa2de30159fcc49610e32edcc038d820
diff: https://github.com/bitrig/bitrig/commit/9b11fe3
author: Christiano F. Haesbaert <[email protected]>
date: Mon Dec 30 01:21:49 2013 +0100

Make __mp_lock_held() return the lock depth.

M       sys/arch/amd64/amd64/lock_machdep.c

commit 51d4b0466f934fe104cc6b686bd6a2cb2a81e5ce
diff: https://github.com/bitrig/bitrig/commit/51d4b04
author: Christiano F. Haesbaert <[email protected]>
date: Mon Dec 30 00:39:49 2013 +0100

Rework the SCHED_LOCK vs KERNEL_LOCK dance in mi_switch().

This avoids the lock/relock needed to fix lock ordering in mi_switch() for
all cases except one.

Before the idea was:
        - mi_switch() releases all kernel locks before context switching.
        - save the count in the stack.
        - context switch with SCHED_LOCK held.
        - wakeup, but to assure lock ordering it needs to:
                - release SCHED_LOCK
                - reacquire KERNEL_LOCK
                - acquire SCHED_LOCK

With this diff, the caller is responsible for doing this, so you can't
enter mi_switch() holding the kernel lock now; you must release/reacquire
it yourself, being careful to always grab SCHED_LOCK before releasing the
kernel locks, otherwise you lose atomicity.

Since we can grab kernel lock within a critical section,
KERNEL_RELOCK_ALL must do the release/reenter dance.

The next step is to make mi_switch() return with SCHED_LOCK unlocked; if
the caller wants it, that is his job, as he has _already_ lost atomicity.
In most cases mi_switch() relocks SCHED_LOCK only for the caller to unlock
it, which is pretty stupid.
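
As a rough sketch of the caller-side dance at this point in the series
(mi_switch() still returns with SCHED_LOCK held here), with the mplock
helper names written from memory and therefore assumptions:

void
example_sleep_path(void)
{
        int s, hold_count;

        SCHED_LOCK(s);          /* take this before dropping the kernel lock */
        hold_count = __mp_release_all(&kernel_lock);
        /* mark curproc as sleeping, adjust state, ... */
        mi_switch();            /* returns with SCHED_LOCK still held */
        SCHED_UNLOCK(s);
        __mp_acquire_count(&kernel_lock, hold_count);
}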

M       sys/kern/kern_ithread.c
M       sys/kern/kern_lock.c
M       sys/kern/kern_sched.c
M       sys/kern/kern_sig.c
M       sys/kern/kern_synch.c
M       sys/kern/sched_bsd.c
M       sys/sys/proc.h
M       sys/sys/systm.h

commit d735f66eadd0bae580bad2df6c7db57c21bf44ca
diff: https://github.com/bitrig/bitrig/commit/d735f66
author: Christiano F. Haesbaert <[email protected]>
date: Sat May 25 13:19:17 2013 +0200

New interrupt model, move away from IPLs.

This diff changes the interrupt model to something very similar to
what other modern unixes like Solaris, FreeBSD, DragonflyBSD and
Linux do. It also introduces critical sections.

Interrupts except clock and ipi are handled by interrupt threads. When
an interrupt fires, the only job of the small interrupt stub is to
schedule the corresponding interrupt thread; when the ithread gets
scheduled, it services the interrupt.
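
As a cartoon of that flow (names assumed; the real code lives in the MD
interrupt glue and kern_ithread.c):

void
intr_stub(struct intrsource *is)
{
        intrsource_mask(is);    /* keep the line quiet while work is deferred */
        ithread_run(is);        /* make the matching interrupt thread runnable */
}

void
ithread_loop(struct intrsource *is)
{
        for (;;) {
                ithread_sleep(is);      /* wait until the stub schedules us */
                /* run the handlers registered on this source */
                intrsource_unmask(is);  /* done, re-enable the line */
        }
}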

The kernel must be made preemptive, so that interrupt threads may
preempt the currently running code.

In this model you normally never block interrupts; you rely solely on
locks to protect the code. If the ithread preempts the running code
and tries to acquire a contested lock, it blocks (sleeps) and gets
resumed only when that lock is released.

This allows us to have lower interrupt latency: instead of blocking
interrupts for a long section, you can fine-grain that section with a
specific lock. Instead of raising to IPL_BIO and preventing softclock
from running, for example, we can make softclock preempt the old
IPL_BIO section and only block if there is actual lock contention.

This was first implemented in Solaris 2 (Kleiman & Eykholt, circa '93);
they demonstrated that this model, once properly locked, changed the
worst-case latency from 1s to <1ms on a single-core 40MHz machine.

It also allows us to properly fight livelocks in the future, as
the scheduler will make sure "everything runs at some point".

The kernel can now be synchronous with regard to locking, as you
can use the same lock to interlock against interrupts or other normal
kthreads.

You can only block on locks if you have a process context, so you
still need a way to block the interrupts that are not handled by
ithreads, like clock and ipi. For that we introduce critical
sections, which block everything; in practice they're only used to
protect scheduler data and in mutexes, so they are very short.

In the future critical sections will also be the only thing that
prevents kernel preemption.

In order to prevent deadlocks, you must never be preempted while
holding a spinlock, so a critical section is implied there; this is
also akin to what every other system does.

In the present state, kernel preemption has not been implemented; all
handlers <IPL_CLOCK were moved to ithreads, there is no observable loss
of performance, and it's been stable for the last half year.

All spl calls <IPL_CLOCK were made no-ops, while >= IPL_CLOCK were
replaced by critical sections.

Most of the old assembly code has been rewritten in C, just because I
refuse to maintain unnecessary asm blocks.

The next steps are:
        o Turn the kernel lock into a rrwlock.
        o Enable kernel preemption; at this point all the interrupt
        interlocking will be done through the kernel lock.
        o Decide on which subsystem to release first, having a wide subsystem lock.

M       sys/arch/amd64/amd64/cpu.c
M       sys/arch/amd64/amd64/db_interface.c
M       sys/arch/amd64/amd64/fpu.c
M       sys/arch/amd64/amd64/genassym.cf
M       sys/arch/amd64/amd64/intr.c
M       sys/arch/amd64/amd64/ipi.c
M       sys/arch/amd64/amd64/lapic.c
M       sys/arch/amd64/amd64/locore.S
M       sys/arch/amd64/amd64/machdep.c
M       sys/arch/amd64/amd64/mp_setperf.c
M       sys/arch/amd64/amd64/spl.S
M       sys/arch/amd64/amd64/trap.c
M       sys/arch/amd64/amd64/vector.S
M       sys/arch/amd64/amd64/via.c
M       sys/arch/amd64/conf/files.amd64
M       sys/arch/amd64/include/atomic.h
M       sys/arch/amd64/include/cpu.h
M       sys/arch/amd64/include/frame.h
M       sys/arch/amd64/include/frameasm.h
M       sys/arch/amd64/include/intr.h
M       sys/arch/amd64/include/intrdefs.h
M       sys/arch/amd64/isa/clock.c
M       sys/arch/arm/arm/db_interface.c
M       sys/arch/arm/arm/pmap.c
M       sys/conf/files
M       sys/dev/acpi/acpi.c
M       sys/dev/cardbus/cardbus.c
M       sys/dev/ic/com.c
M       sys/dev/ic/comvar.h
M       sys/dev/ic/elink3.c
M       sys/dev/ic/i82365.c
M       sys/dev/ic/tcic2.c
M       sys/dev/ic/vga.c
M       sys/dev/ic/vga_subr.c
M       sys/dev/isa/if_ef_isapnp.c
M       sys/dev/isa/pcppi.c
M       sys/dev/onewire/onewire_bitbang.c
M       sys/dev/pci/drm/drm_atomic.h
M       sys/dev/pci/drm/radeon/radeon_kms.c
M       sys/dev/pci/mbg.c
M       sys/dev/pci/pccbb.c
M       sys/dev/pci/pci.c
M       sys/dev/pci/pci_map.c
M       sys/dev/pci/pciide.c
M       sys/dev/pcmcia/if_xe.c
M       sys/dev/rasops/rasops.c
M       sys/dev/sdmmc/sdmmc_io.c
M       sys/dev/usb/ehci.c
M       sys/dev/usb/ohci.c
M       sys/dev/usb/uhci.c
M       sys/dev/usb/umidi.c
M       sys/dev/usb/usb.c
M       sys/dev/usb/usbdivar.h
M       sys/dev/usb/usbf_subr.c
M       sys/dev/usb/usbfvar.h
M       sys/dev/wsfont/wsfont.c
M       sys/kern/init_main.c
M       sys/kern/kern_clock.c
A       sys/kern/kern_crit.c
M       sys/kern/kern_event.c
M       sys/kern/kern_exec.c
M       sys/kern/kern_fork.c
A       sys/kern/kern_ithread.c
M       sys/kern/kern_lock.c
M       sys/kern/kern_mutex.c
M       sys/kern/kern_proc.c
M       sys/kern/kern_resource.c
M       sys/kern/kern_sched.c
M       sys/kern/kern_sensors.c
M       sys/kern/kern_sig.c
M       sys/kern/kern_synch.c
M       sys/kern/kern_time.c
M       sys/kern/kern_timeout.c
M       sys/kern/sched_bsd.c
M       sys/kern/subr_disk.c
M       sys/kern/subr_evcount.c
M       sys/kern/subr_hibernate.c
M       sys/kern/subr_log.c
M       sys/kern/subr_pool.c
M       sys/kern/subr_prf.c
M       sys/kern/subr_prof.c
M       sys/kern/subr_xxx.c
M       sys/kern/sys_generic.c
M       sys/kern/sys_process.c
M       sys/kern/vfs_subr.c
M       sys/kern/vfs_sync.c
M       sys/lib/libkern/libkern.h
M       sys/net/if_tun.c
M       sys/net/netisr.c
M       sys/net/netisr.h
M       sys/net/pipex.c
A       sys/sys/ithread.h
M       sys/sys/proc.h
M       sys/sys/sched.h
M       sys/sys/systm.h
M       sys/sys/timeout.h
M       sys/uvm/uvm_map.c
