The bitrig smpns branch has been created by haesbaert. It is 0 commits behind master and 51 commits ahead. The commits follow, newest first.
commit 4da218f831e15c6f6ccedd4c6ae1d20b0dab6a9b
diff: https://github.com/bitrig/bitrig/commit/4da218f
author: Christiano Haesbaert <[email protected]>
date: Wed Jan 14 14:37:25 2015 +0100

Fix the clearing of SINTR which I fucked up before.

After my "fix" in 2e0fab6712b7b8272bc41ecada05114f07918d96 ("Make sure we
don't try to grab a bmtx with set P_SINTR."), SINTR would only get cleared
if we mi_switched in sleep_finish(), which is blatantly wrong.

I also blame immigrants on benefits, and the German healthcare system for
lowering my ritalin dosage.

M sys/kern/kern_synch.c M sys/kern/sched_bsd.c

commit 23e7adc26ffeb59009679841a836a99266d86cf0
diff: https://github.com/bitrig/bitrig/commit/23e7adc
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 23:08:30 2015 +0100

Kill ih_pin in amd64, we have that in intrsource.is_pin.

M sys/arch/amd64/amd64/intr.c M sys/arch/amd64/include/intr.h
M sys/kern/kern_softintr.c

commit b881f2f8688bc528c02a141ed76146703373247b
diff: https://github.com/bitrig/bitrig/commit/b881f2f
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 18:52:39 2015 +0100

ARM atomic.h should include stdatomic if it inlines C11 atomic stuff.

M sys/arch/arm/include/atomic.h

commit 5a9a688e7bcd030b73026e24c8109bae3b949337
diff: https://github.com/bitrig/bitrig/commit/5a9a688
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 20:10:52 2015 +0100

Missing softintr.h include in kern_sig.c.

M sys/kern/kern_sig.c

commit f280829290629341443385531c2c5b9653b46133
diff: https://github.com/bitrig/bitrig/commit/f280829
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 20:46:16 2015 +0100

More missing includes found while porting smpns to ARM.

M sys/dev/ic/ahcivar.h M sys/dev/sdmmc/sdmmc_io.c M sys/kern/kern_malloc.c
M sys/kern/subr_disk.c M sys/kern/subr_evcount.c

commit fa68070574170fb77da28c7efa5f0fab8468cdf0
diff: https://github.com/bitrig/bitrig/commit/fa68070
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 20:08:16 2015 +0100

Missing proc.h for crit_enter().

M sys/kern/kern_sensors.c

commit 029cbe89e85f3da8baeab171f41977d4c5074fd6
diff: https://github.com/bitrig/bitrig/commit/029cbe8
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 20:02:42 2015 +0100

When we are in sched_idle() interrupts are already enabled, zap
intr_enable(). This is a leftover from a less civilized time.

M sys/kern/kern_sched.c

commit 2ed85254401251a62186c1b7dfa8d9be5f152e9b
diff: https://github.com/bitrig/bitrig/commit/2ed8525
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 17:46:33 2015 +0100

Some define alignment and whitespace cleanup.

M sys/arch/arm/cortex/ampintc.c

commit 57a6c6ea2d47062afa90d3b2570647708c4406a4
diff: https://github.com/bitrig/bitrig/commit/57a6c6e
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 17:39:42 2015 +0100

Rename intrsource members from iq_* to is_* and kill two unused ones.

M sys/arch/arm/cortex/ampintc.c M sys/arch/armv7/include/intr.h
M sys/arch/armv7/omap/intc.c M sys/arch/armv7/sunxi/a1xintc.c

commit d41c39df3487c3653b999988b183566e45ae2164
diff: https://github.com/bitrig/bitrig/commit/d41c39d
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 17:32:10 2015 +0100

Rename intrq to intrsource and move to a common arm header.
M sys/arch/arm/cortex/ampintc.c M sys/arch/armv7/include/intr.h
M sys/arch/armv7/omap/intc.c M sys/arch/armv7/sunxi/a1xintc.c

commit fed1f4c78b527121feaed5bb3db853c168ebafcc
diff: https://github.com/bitrig/bitrig/commit/fed1f4c
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 18:45:04 2015 +0100

bmtx_test is amd64 only until we have a cycle counter API.

M sys/kern/kern_bmtx.c M sys/sys/bmtx.h

commit b57dff4d481647391c42d31566bbc3ae13b394a6
diff: https://github.com/bitrig/bitrig/commit/b57dff4
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 18:43:21 2015 +0100

If you're using crit_enter() you need proc.h.

M sys/dev/ic/com.c M sys/dev/usb/ehci.c

commit 8f53f0f59d24c4353372e9a8bd4d9dde68fdc7d6
diff: https://github.com/bitrig/bitrig/commit/8f53f0f
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 16:49:03 2015 +0100

Convert the hand-rolled intr handler list to a TAILQ in amd64. Moves it
closer to arm in making intrsource MI.

M sys/arch/amd64/amd64/intr.c M sys/arch/amd64/include/intr.h
M sys/arch/amd64/isa/isa_machdep.c M sys/kern/kern_ithread.c
M sys/kern/kern_softintr.c

commit 87e92c2ed300d5f5f1c93cd85c77dcd28247387c
diff: https://github.com/bitrig/bitrig/commit/87e92c2
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 16:44:02 2015 +0100

Fix use after free in intr_disestablish().

M sys/arch/amd64/amd64/intr.c

commit 2b0ddbec352fbad2f152086a25a3dd79bf6c80e4
diff: https://github.com/bitrig/bitrig/commit/2b0ddbe
author: Christiano Haesbaert <[email protected]>
date: Tue Jan 13 15:06:32 2015 +0100

Clean up amd64's genassym.cf.

M sys/arch/amd64/amd64/genassym.cf

commit 4a9b05144dd309d08b984f34935f111eb2015682
diff: https://github.com/bitrig/bitrig/commit/4a9b051
author: Christiano Haesbaert <[email protected]>
date: Mon Jan 12 16:37:35 2015 +0100

Make sure ipi and clocks are always in a critical section. Don't assert
ipi CRIT_DEPTH because of ddb and so on.

M sys/arch/amd64/amd64/ipi.c M sys/arch/amd64/amd64/lapic.c
M sys/arch/amd64/amd64/spl.S M sys/kern/kern_clock.c

commit 9b2aee79930faa384e702b335857798910b6469f
diff: https://github.com/bitrig/bitrig/commit/9b2aee7
author: Christiano Haesbaert <[email protected]>
date: Mon Jan 12 15:54:27 2015 +0100

Add P_ITHREAD to the string formatting bits.

M sys/sys/proc.h

commit 2e0fab6712b7b8272bc41ecada05114f07918d96
diff: https://github.com/bitrig/bitrig/commit/2e0fab6
author: Christiano Haesbaert <[email protected]>
date: Mon Jan 12 15:52:53 2015 +0100

Make sure we don't try to grab a bmtx with set P_SINTR. If we sleep on
the bmtx with P_SINTR set, a signal might make us runnable; that is
illegal, so assert it.

M sys/kern/kern_bmtx.c M sys/kern/kern_synch.c M sys/kern/sched_bsd.c
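(The invariant in that last commit is easy to picture. A minimal sketch,
assuming a BSD-style KASSERT and the usual curproc/p_flag fields; the
function body is illustrative, not the actual kern_bmtx.c code:

    /*
     * Sketch: never block on a bmtx while P_SINTR is set, since a
     * signal could make the process runnable behind the lock's back.
     */
    void
    bmtx_lock(struct bmtx *bmtx)
    {
            KASSERT((curproc->p_flag & P_SINTR) == 0);
            /* ... fast path / slow path as described below ... */
    }
)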
commit a8dd39ad263645176b3e02e339bfe4c9be338617
diff: https://github.com/bitrig/bitrig/commit/a8dd39a
author: Christiano Haesbaert <[email protected]>
date: Wed Dec 10 02:14:52 2014 +0100

Give device names to ithreads' p_comm and evcounters to softintr.

Concatenate all device names in the ithread's process name; this is
useful for debugging and/or profiling. Also put event counters on
software interrupts, I can't recall how many times I wanted to know this.

A ps auxkl might show the following now:

root  6034  0.0  0.0  0  0 ??  DK  2:01AM  0:04.97 (softclock) 0 0 0 -18 0 intr
root 24717  0.0  0.0  0  0 ??  DK  2:01AM  0:01.06 (softnet) 0 0 0 -18 0 intr
root 25481  0.0  0.0  0  0 ??  DK  2:01AM  0:01.08 (ehci0,mpi0) 0 0 0 -18 0 intr
root 23688  0.0  0.0  0  0 ??  DK  2:01AM  0:01.02 (em0) 0 0 0 -18 0 intr
....

M sys/dev/ic/com.c M sys/dev/usb/usb.c M sys/kern/kern_clock.c
M sys/kern/kern_ithread.c M sys/kern/kern_softintr.c M sys/net/netisr.c
M sys/net/pipex.c M sys/sys/softintr.h

commit 722aab0cc07ee76c905eff038f0c5e8536c24deb
diff: https://github.com/bitrig/bitrig/commit/722aab0
author: Christiano Haesbaert <[email protected]>
date: Mon Dec 8 23:36:55 2014 +0100

Cleanup softintr, remove establish_mpsafe now that IPL_FLAGS is here.
Also kill unused softintr.c from amd64. ok pwildt@

D sys/arch/amd64/amd64/softintr.c M sys/kern/kern_softintr.c
M sys/sys/softintr.h

commit 59378680a96f229f3f2368e659151b146c445031
diff: https://github.com/bitrig/bitrig/commit/5937868
author: Patrick Wildt <[email protected]>
date: Sat Aug 23 22:52:57 2014 +0200

Revert to nearly the old softintr_ API, but keep ithreads for it.
ok haesbaert@

M sys/arch/amd64/include/intr.h M sys/conf/files M sys/dev/ic/com.c
M sys/dev/usb/usb.c M sys/kern/kern_clock.c M sys/kern/kern_ithread.c
M sys/kern/kern_sig.c A sys/kern/kern_softintr.c M sys/net/netisr.c
M sys/net/netisr.h M sys/net/pipex.c M sys/sys/ithread.h
A sys/sys/softintr.h

commit 4cda52aad456ae20f9c4443cd5f9105b45c5bceb
diff: https://github.com/bitrig/bitrig/commit/4cda52a
author: M Farkas-Dyck <[email protected]>
date: Mon Dec 8 09:47:29 2014 -0500

unbreak bmtx_fetch_and

M sys/kern/kern_bmtx.c

commit b3fb09f17c9bec63ccb86595e2a90df740d4381d
diff: https://github.com/bitrig/bitrig/commit/b3fb09f
author: Christiano Haesbaert <[email protected]>
date: Fri Dec 5 12:38:03 2014 +0100

Hook kernel preemption back and allow "total" preemption.

kern.preemption = 2 will make any higher-priority thread preempt the
current thread, if and only if they are on the same cpu. Highly
experimental; performance drops (since contention on the kernel lock
causes even more context switches). kern.preemption = 1 still allows
preemption just for interrupt threads.

M sys/kern/kern_sysctl.c M sys/kern/sched_bsd.c

commit eb261555882514990998d08fed32046a19990c5c
diff: https://github.com/bitrig/bitrig/commit/eb26155
author: Christiano Haesbaert <[email protected]>
date: Thu Dec 4 18:56:17 2014 +0100

Use acq_rel in atomic_fetch_{or,and} for kern_bmtx. Also make wrappers
around the atomic_c11 calls; the code reads better.

M sys/kern/kern_bmtx.c

commit e58fc5a05741817f4ae02cfd62a56b4fd9209c18
diff: https://github.com/bitrig/bitrig/commit/e58fc5a
author: Christiano Haesbaert <[email protected]>
date: Thu Dec 4 17:03:58 2014 +0100

Use uintptr_t instead of u_long in kern_bmtx.c.

M sys/kern/kern_bmtx.c

commit 72602d41e7ac8e28f5d011d2a740ef5b9710cd05
diff: https://github.com/bitrig/bitrig/commit/72602d4
author: Christiano Haesbaert <[email protected]>
date: Tue Dec 2 19:21:36 2014 +0100

Simplify ithread_sleep() and ithread_run(). setrunnable() is less
efficient since it does unnecessary computations for a kthread. Even so,
let's use it, as it diminishes the amount of replicated code.

M sys/kern/kern_ithread.c M sys/kern/sched_bsd.c M sys/sys/proc.h

commit 6b3ff84f60f852972eb0e8d8b92814b1b4bc5475
diff: https://github.com/bitrig/bitrig/commit/6b3ff84
author: Christiano Haesbaert <[email protected]>
date: Thu Oct 30 21:04:14 2014 +0100

crit_rundeferred() -> crit_unpend(), call MD intr_unpend().

Make crit_unpend() the function to be called every time we leave a
critical section, and make it call intr_unpend(), which is MD. Trying to
keep the code in intr_unpend() MI is unrealistic now.

M sys/arch/amd64/amd64/intr.c M sys/arch/amd64/amd64/spl.S
M sys/arch/amd64/include/intr.h M sys/kern/kern_crit.c M sys/sys/proc.h
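(To picture where crit_unpend() sits: a minimal sketch. crit_unpend() and
intr_unpend() are the names from the commit, crit_enter()/crit_leave()
appear elsewhere in this log, and the per-cpu ci_crit_depth counter is an
assumption:

    /*
     * Sketch: leaving the outermost critical section runs whatever
     * work was deferred while it was held.
     */
    void
    crit_leave(void)
    {
            struct cpu_info *ci = curcpu();

            if (--ci->ci_crit_depth == 0)
                    crit_unpend();  /* MI; calls the MD intr_unpend() */
    }
)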
commit 5ccfcb0185a7c4333243cb4c1a817084d8f769dc
diff: https://github.com/bitrig/bitrig/commit/5ccfcb0
author: Christiano Haesbaert <[email protected]>
date: Mon Sep 29 22:46:05 2014 +0200

Slightly more efficient crit_rundeferred(). Instead of
disabling/enabling interrupts for every ipending intrsource, do it once
and clear it. Another option is using atomic_exchange_explicit(), but
the bus locking is unnecessary since this is a cpu-local word.

M sys/kern/kern_crit.c

commit d09f098c66d165ed4e8541c62ec2c2222ec0efca
diff: https://github.com/bitrig/bitrig/commit/d09f098
author: Christiano Haesbaert <[email protected]>
date: Sun Sep 28 12:23:23 2014 +0200

Be a bit more strict with the KERNEL_LOCK ordering DIAGNOSTIC. Also make
the loss of atomicity in pool_get() drop and reacquire the critical
sections. This makes the kernel lock crit depth messages go away on
boot.

M sys/kern/kern_malloc.c M sys/kern/subr_pool.c M sys/sys/systm.h

commit a506af46b5cd9ee732f9686a1e46925a2bc4f964
diff: https://github.com/bitrig/bitrig/commit/a506af4
author: Christiano Haesbaert <[email protected]>
date: Sat Sep 27 13:01:27 2014 +0200

Kill CRITCOUNTERS leftovers.

M sys/kern/kern_crit.c

commit 665cc907f77789b3abc836f8be5ea27464ae109f
diff: https://github.com/bitrig/bitrig/commit/665cc90
author: Christiano Haesbaert <[email protected]>
date: Sat Sep 27 12:52:48 2014 +0200

Use intr_ as the prefix for the interrupt API: intr_[disable|enable|...].
Discussed with Patrick; we both agree this makes more sense than using a
suffix. Also use intr_get_state() and intr_set_state() instead of
state() and restore().

M sys/arch/amd64/amd64/amd64_mem.c M sys/arch/amd64/amd64/hibernate_machdep.c
M sys/arch/amd64/amd64/i8259.c M sys/arch/amd64/amd64/ioapic.c
M sys/arch/amd64/amd64/ipifuncs.c M sys/arch/amd64/amd64/lapic.c
M sys/arch/amd64/amd64/lock_machdep.c M sys/arch/amd64/amd64/machdep.c
M sys/arch/amd64/include/cpufunc.h M sys/arch/amd64/isa/clock.c
M sys/dev/acpi/acpi.c M sys/dev/isa/gus.c M sys/kern/kern_crit.c
M sys/kern/kern_sched.c

commit 160ce7ebbabf9a9447e9c73712417ba444c291c1
diff: https://github.com/bitrig/bitrig/commit/160ce7e
author: Christiano Haesbaert <[email protected]>
date: Tue Sep 16 20:21:46 2014 +0200

Add ithread.h and bmtx.h to the distrib sets.

M distrib/sets/lists/base/md.amd64 M distrib/sets/lists/comp/mi

commit 445a62c9862f09baba96f067f64f0da80d3cd82b
diff: https://github.com/bitrig/bitrig/commit/445a62c
author: Christiano Haesbaert <[email protected]>
date: Fri Aug 29 13:38:18 2014 +0200

Use more sensible labels in vector.S.

M sys/arch/amd64/amd64/vector.S

commit ea0ff4eb1271d65172134cb4668384c17c1919b3
diff: https://github.com/bitrig/bitrig/commit/ea0ff4e
author: Christiano Haesbaert <[email protected]>
date: Wed Aug 27 20:18:12 2014 +0200

Give names to bmtx locks and introduce bmtx_dump(). The name of the lock
is saved in p_wmesg in case the process is waiting for the lock.

M sys/kern/kern_bmtx.c M sys/kern/kern_lock.c M sys/sys/bmtx.h

commit b7f2d7ba788a8e0de9b591c5261a92f766497c56
diff: https://github.com/bitrig/bitrig/commit/b7f2d7b
author: Christiano Haesbaert <[email protected]>
date: Wed Aug 27 18:21:02 2014 +0200

Decouple the idea of masking/unmasking an intrsource from pic{}.

When we take an interrupt, we mask the source and schedule the thread;
when the thread eventually runs and finishes processing, it unmasks the
source. So there must be a way of unmasking the source in an MI fashion:
each architecture should have an intrsource_unmask() function. In amd64,
we just map it to the corresponding pic{} callback.

M sys/arch/amd64/include/intr.h M sys/kern/kern_ithread.c
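(On amd64 that hook can be a thin wrapper around the pic{} callback. A
sketch, where intrsource_unmask() and is_pin come from this log but
is_pic and pic_hwunmask are assumed member names:

    /*
     * Sketch of the amd64 mapping: the MI unmask hook just forwards
     * to the corresponding pic{} callback.
     */
    static inline void
    intrsource_unmask(struct intrsource *is)
    {
            struct pic *pic = is->is_pic;

            (*pic->pic_hwunmask)(pic, is->is_pin);
    }
)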
commit e3d1cd48f45b64bb70fa2059e43976535a083bf1
diff: https://github.com/bitrig/bitrig/commit/e3d1cd4
author: Christiano Haesbaert <[email protected]>
date: Wed Aug 27 18:06:11 2014 +0200

Fix interrupt accounting with intr_shared_edge != 0.

M sys/kern/kern_ithread.c

commit 7c5eb26a00cfd7b48a5aa66f111898ae1eee164c
diff: https://github.com/bitrig/bitrig/commit/7c5eb26
author: Christiano Haesbaert <[email protected]>
date: Wed Aug 27 15:49:12 2014 +0200

Whitespace & wording.

M sys/kern/kern_ithread.c

commit f3841c34f8b99a959e5a37c9c5221cc9df45807b
diff: https://github.com/bitrig/bitrig/commit/f3841c3
author: Christiano Haesbaert <[email protected]>
date: Wed Aug 27 14:50:15 2014 +0200

Introduce intr_state_t and an MI "API" for blocking interrupts.

The fact that disabling/enabling real interrupts is totally MD makes
writing portable code harder, so I propose the following API:

enable_intr(void)              Enables "all" hw interrupts.
disable_intr(void)             Disables "all" hw interrupts.
intr_state_t state_intr(void)  Reads hw interrupt state.
restore_intr(intr_state_t)     Restores hw interrupt state.

I think this should map easily to arm or any other future platform;
intr_state_t should be defined to whatever is convenient for the arch.
Switch crit_rundeferred() to use it; when we port stuff to arm, we just
need to mimic the API.

M sys/arch/amd64/include/cpufunc.h M sys/kern/kern_crit.c
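(A usage sketch of that API as proposed; commit 665cc90 above later
renames these to the intr_* form. The function wrapping it is
hypothetical:

    /*
     * Sketch of the save/disable/restore pattern the API enables.
     */
    void
    touch_percpu_data(void)
    {
            intr_state_t s;

            s = state_intr();       /* remember hw interrupt state */
            disable_intr();         /* block "all" hw interrupts */
            /* ... touch data an interrupt stub could also touch ... */
            restore_intr(s);        /* put it back as we found it */
    }
)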
commit dab636e846d04e85dcd1886e0823b4d86e7e4134
diff: https://github.com/bitrig/bitrig/commit/dab636e
author: Christiano Haesbaert <[email protected]>
date: Tue Aug 26 16:13:16 2014 +0200

Kill is_minlevel & is_maxlevel from intrsource{}. This is a step toward
turning intrsource{} MI and getting rid of the IPL leftovers.

M sys/arch/amd64/amd64/acpi_machdep.c M sys/arch/amd64/amd64/genassym.cf
M sys/arch/amd64/amd64/intr.c M sys/arch/amd64/amd64/machdep.c
M sys/arch/amd64/include/intr.h M sys/arch/amd64/include/segments.h
M sys/kern/kern_ithread.c

commit e7b89067fa1ddc1dcad44adb95ef35f64c26967e
diff: https://github.com/bitrig/bitrig/commit/e7b8906
author: Christiano Haesbaert <[email protected]>
date: Fri Jun 27 16:27:14 2014 +0200

Turnstile lock implementation, turn kernel lock into one.

BMTX stands for "blocking mutex", and it's implemented as a turnstile
exclusive recursive lock. The kernel lock has been made a bmtx.

Background
~~~~~~~~~~
I've been tracking the performance impact of the smpns changes, and the
one thing that really hurt performance was turning the kernel lock into
a rrwlock. The number of IPIs when building the kernel or world would go
over 5k/s, much higher than the <1k/s of the original mplock
implementation. That should be no surprise, since rrwlock is an
always-sleeping lock which goes through wakeup/tsleep on every
contention. Performance also degraded considerably: baseline would
build an image in ~4m, versus ~5m25s after the change to rrwlock.
Therefore, the motivation for a turnstile lock was purely performance,
as it keeps the same semantics as a sleeping lock like rwlock.

Semantics
~~~~~~~~~
o Bmtx uses no critical section and doesn't block interrupts in any
  way; it is resilient to preemption. It is adequate for interlocking
  against interrupt threads, so it should become the normal lock
  throughout the kernel. This is the attempt at having synchronous
  locking across interrupts and kernel threads.
o Recursion is allowed; options for forbidding it in the future should
  be discussed.
o Interleaving on unlocking is allowed; both of the following are legal:

    bmtx_lock(a);
    bmtx_lock(b);
    bmtx_unlock(a);    OR    bmtx_unlock(b);
    bmtx_unlock(b);          bmtx_unlock(a);

o Sleeping with a bmtx held is legal.

Implementation
~~~~~~~~~~~~~~
o The cost of an uncontested operation is never higher than the maximum
  cost of a single compare-and-set operation; the average cost in the
  cached case is less than 50 cycles per lock/unlock pair on my Haswell
  i5 2.4GHz, considering function call overhead to be 0. We can easily
  cut the function call overhead by turning the function into a macro
  that tries the fast operation (one compare-and-set) inline and calls
  into the function only if that fails.
o Bmtx synchronizes on one atomic word called bmtx_lock. Bits 2-31 (or
  2-63) store the address of the owning proc structure; the first two
  bits are abused to store two flags which are not mutually exclusive:
    BMTX_RECURSE - signals that this bmtx was recursed.
    BMTX_WAITERS - signals that there are sleepers waiting to acquire
                   the lock.
o Bmtx is a turnstile lock: it spins _as long_ as the owner of the lock
  is currently _running_ on another cpu. Under all other conditions,
  the caller blocks. Blocking means setting BMTX_WAITERS in the atomic
  word and inserting itself into the bmtx's waitqueue.
o To be able to block on the lock, we can't depend only on the atomic
  word; we need a way to interlock with the other cpu so that we don't
  miss a wakeup. This is implemented using SCHED_LOCK as an interlock,
  which lets us overcome the lost-wakeup race. The algorithm is roughly:
    1 Spin on the lock as long as the holder is SONPROC (running). The
      test for SONPROC is done in a racy fashion: we follow the process
      structure and check p->p_stat.
    2 If the owner appears not to be SONPROC, we interlock with
      SCHED_LOCK and _retest_ the condition, thereby recovering from
      the race.
    3 If the owner is indeed blocked, we set BMTX_WAITERS and put
      ourselves in the queue.
  On the unlock path, the fast path only attempts to clear ownership if
  BMTX_WAITERS is _not_ set; if it is, we need the slower path, where
  we interlock with SCHED_LOCK, saving ourselves from the race.

In the future, when the scheduler doesn't suck horse-ass, we will
interlock with an arbitrary lock, something like a stripelock, instead
of a global SCHED_LOCK. We could do it with mutexes and msleep now, but
that is pointless, as it just adds another lock on top of SCHED_LOCK to
pass ownership through. A performant implementation requires the
scheduler to expose the underlying interlock to the caller.

CAVEATS
~~~~~~~
On the lock path, when testing if the owner is SONPROC, we have the
following comment:

    /* XXX if we are preempted here, high chance p might have gone and
     * we're fucked. This is an "accepted" race in freebsd because of
     * the way they recycle the td structures, it might not be to us.
     * We could think of a crit_enter()/crit_leave(). */

We cannot safely test p and dereference p->p_stat: the process might
have gone through sched_exit() and p might be freed. The timing would
have to be very, very evil; I've never hit this situation and I'm
buying it for now. Alternative solutions are changing the way we
allocate proc{} (like freebsd), or keeping track of all held mutexes
and updating a state on the mutex itself saying "OWNER_IS_RUNNING".

Results
~~~~~~~
My build times went back to being as performant as baseline, sometimes
slightly faster. I could only test this in vmware so far, so I can't
have precise numbers, but I would go as far as saying it is _no_ worse
than the baseline kernel. Build time improved around 25% against the
rrwlock implementation. IPIs were reduced from 5000-6000/s to
200-1200/s.

Statistics
~~~~~~~~~~
I've collected the number of fast vs slow paths by defining BMTX_STATS
on the lock. As shown below, the number of slow paths is insignificant
compared to the fast paths:

bmtx_fast_locks              21017802
bmtx_fast_unlocks            31133706
bmtx_recursed_locks            524294
bmtx_recursed_unlocks          524279
bmtx_lock_blocks                33174
bmtx_lock_blocks_cancelled      29742
bmtx_lock_blocks_slept           3862
bmtx_spun_locks              10117383
bmtx_blocked_unlocks             3866
bmtx_unlock_has_waiter           3866

M sys/conf/files A sys/kern/kern_bmtx.c M sys/kern/kern_lock.c
A sys/sys/bmtx.h M sys/sys/mplock.h M sys/sys/systm.h
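(The fast paths above reduce to a single compare-and-set each. A sketch
in C11 atomics, which the branch already uses elsewhere; the word layout
and flag names come from the commit, everything else is illustrative:

    #include <stdatomic.h>
    #include <stdint.h>

    #define BMTX_RECURSE    0x1UL   /* this bmtx was recursed */
    #define BMTX_WAITERS    0x2UL   /* sleepers are waiting */

    struct bmtx {
            _Atomic uintptr_t bmtx_lock;    /* owner proc | flag bits */
    };

    /* Fast lock path: lock word is 0, install ourselves as owner. */
    static int
    bmtx_lock_fast(struct bmtx *m, uintptr_t owner)
    {
            uintptr_t unowned = 0;

            return (atomic_compare_exchange_strong_explicit(
                &m->bmtx_lock, &unowned, owner,
                memory_order_acquire, memory_order_relaxed));
            /* On failure: spin while the owner is SONPROC, otherwise
             * set BMTX_WAITERS under SCHED_LOCK and block. */
    }

    /* Fast unlock path: only succeeds if no flag bit is set. */
    static int
    bmtx_unlock_fast(struct bmtx *m, uintptr_t owner)
    {
            uintptr_t expect = owner;

            return (atomic_compare_exchange_strong_explicit(
                &m->bmtx_lock, &expect, 0,
                memory_order_release, memory_order_relaxed));
            /* On failure: BMTX_WAITERS or BMTX_RECURSE is present,
             * so take the slow path under SCHED_LOCK. */
    }
)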
commit 6d9ec437d7cda829be660b09d4a656a0f5e772e5
diff: https://github.com/bitrig/bitrig/commit/6d9ec43
author: Christiano Haesbaert <[email protected]>
date: Fri Jun 6 11:42:01 2014 +0200

Stop using the sleepqueue for interrupt threads. There is no need for
it; it cuts some hacks and it's fewer instructions. Since we always wake
up from ithread_run(), we always know the proc to awaken; the sole point
of the sleepqueue is to be storage that can be found via ident hashing.

M sys/kern/kern_ithread.c

commit 31f8fd64850b581c75b21a5ef1898f78543900cb
diff: https://github.com/bitrig/bitrig/commit/31f8fd6
author: Christiano Haesbaert <[email protected]>
date: Sun May 11 17:22:11 2014 +0200

No need for running interrupt handlers in critical sections anymore.
Preemption correctly relies on priority, so an ithread->ithread
preemption due to an ithread sleeping at a lower priority should be ok.

M sys/kern/kern_ithread.c

commit 20277823329ba17897981a4f9c8e72024b5b7b72
diff: https://github.com/bitrig/bitrig/commit/2027782
author: Christiano Haesbaert <[email protected]>
date: Sat Mar 1 11:08:55 2014 +0100

Initial kernel preemption.

This diff introduces a rudimentary form of kernel preemption: it allows
interrupt threads to preempt other kernel processing, like userland
doing syscalls or other normal kthreads. Preemption is deferred if the
current process is in a critical section.

When curproc preempts:
o It must not be stolen to run on another cpu; it is still SRUNable,
  but must therefore be pinned to the curcpu() runqueue.
o It must continue to hold all the locks it had, in our case rwlocks,
  including the kernel lock.
o In case the _preempting_ process (not the preempted) tries to acquire
  the kernel lock while the holder was preempted, the _preempting_
  process blocks so that the _preempted_ one might be resumed.

CAVEATS:
o The way we test for preemption is still primitive. I'll only be able
  to attack this properly once I introduce proper APIs into the
  scheduler, but for that I want to make the scheduler modular, so that
  we are able to play with different schedulers. So we need MORE APIs,
  not LESS, as _some_ people believe.
o ci_want_resched, which is responsible for preempting userland, is
  still present, as at this point we need it. Once I'm able to hack the
  scheduler there should be only one way of preemption, so it will look
  a bit funky until then.

Also introduce a sysctl (kern.preemption) which enables us to activate
or deactivate kernel preemption.

M sys/kern/kern_crit.c M sys/kern/kern_exit.c M sys/kern/kern_ithread.c
M sys/kern/kern_ktrace.c M sys/kern/kern_sched.c M sys/kern/kern_subr.c
M sys/kern/kern_sysctl.c M sys/kern/sched_bsd.c M sys/kern/subr_xxx.c
M sys/sys/proc.h M sys/sys/sched.h M sys/sys/syscall_mi.h
M sys/sys/sysctl.h M sys/uvm/uvm_glue.c
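(A sketch of the deferral rule described above. kern.preemption and the
defer-while-in-critical-section behaviour are from the log; every
identifier here (kern_preemption, ci_crit_depth, ci_preempt_pending,
preempt_request, need_resched) is hypothetical:

    /*
     * Sketch: an ithread becoming runnable preempts curproc only if
     * preemption is enabled and we are outside critical sections;
     * otherwise the request is parked until crit_leave().
     */
    void
    preempt_request(struct cpu_info *ci)
    {
            if (kern_preemption == 0)       /* kern.preemption sysctl */
                    return;
            if (ci->ci_crit_depth > 0) {    /* in a critical section */
                    ci->ci_preempt_pending = 1;
                    return;
            }
            need_resched(ci);               /* switch at next chance */
    }
)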
commit 6a011b0aa41ce11db667b570b4705e5d7cbdfa14
diff: https://github.com/bitrig/bitrig/commit/6a011b0
author: Christiano Haesbaert <[email protected]>
date: Wed Feb 26 23:24:37 2014 +0100

Newly formed processes should start in a critical section. They have the
schedlock and release it in proc_trampoline_mp(); if the newly formed
process is not in a critical section, a clock interrupt might attempt to
grab the SCHED LOCK, or worse, we might preempt too early in the future.
This paves the way for preemption and allows SCHED LOCK to be a mutex;
we only need guenther's fix for the recursive tsleep.

M sys/kern/kern_fork.c M sys/sys/proc.h

commit 3d0d940f5b40a6a7b41c62ba5b58733108baf57a
diff: https://github.com/bitrig/bitrig/commit/3d0d940
author: Christiano F. Haesbaert <[email protected]>
date: Sat Feb 1 16:20:50 2014 +0100

Convert kernel lock to a rrwlock.

Introduce rrw_exit_all(), rrw_enter_cnt() and rw_held()/rrw_held().
Also make the assert-unlocked functions make sense: now we assert that
_we_ don't hold the lock; asserting that no one holds the lock is
impossible.

We can't sleep in proc_trampoline_mp for the idle process, since it
should never be in a sleepqueue, so I had to introduce a hack to make it
not acquire the kernel lock. The correct fix is to pass a flag to
fork1() that gets propagated to proc_trampoline_mp().

M sys/arch/amd64/amd64/intr.c M sys/arch/amd64/amd64/ipifuncs.c
M sys/kern/kern_fork.c M sys/kern/kern_ithread.c M sys/kern/kern_lock.c
M sys/kern/kern_rwlock.c M sys/kern/kern_sched.c M sys/kern/kern_synch.c
M sys/sys/mplock.h M sys/sys/rwlock.h M sys/sys/systm.h

commit 42c13eb9ab00946a42043f44a8dce7a40f75bac2
diff: https://github.com/bitrig/bitrig/commit/42c13eb
author: Christiano F. Haesbaert <[email protected]>
date: Tue Dec 31 02:23:56 2013 +0100

Fix compilation of !MULTIPROCESSOR. Armani had fixed this a while ago in
the old OpenBSD tree; this is just an updated version of his diff.
Spotted by aalm.

M sys/kern/sched_bsd.c M sys/sys/systm.h

commit 2ef003c01889a9fb42b0f47fe23f96bb3e7ebce1
diff: https://github.com/bitrig/bitrig/commit/2ef003c
author: Christiano F. Haesbaert <[email protected]>
date: Mon Dec 30 01:32:35 2013 +0100

Remove __mp_release_all_but_one().

M sys/arch/amd64/amd64/lock_machdep.c M sys/arch/amd64/include/mplock.h
M sys/sys/mplock.h

commit 613491f24f22b6bc47e2bc8bde5de32c4a1f7a0f
diff: https://github.com/bitrig/bitrig/commit/613491f
author: Christiano F. Haesbaert <[email protected]>
date: Mon Dec 30 01:27:11 2013 +0100

Make it so that mi_switch() won't relock sched_lock; the caller needs to
reacquire it if desired. More complex paths like proc_stop() were left
as before, so they reacquire the lock. Almost every user of mi_switch()
just does a SCHED_UNLOCK() after return, so this commit prevents a
LOCK/UNLOCK on a contended lock. For this to work, you can never
mi_switch with a recursive sched_lock, and this is asserted.

Thanks to grunk@ for questioning me on the diff and finding a bug whose
fix should repair suspend/resume: a double acquisition of SCHED_LOCK in
sched_idle(). natano@ reported the bug.

M sys/kern/kern_ithread.c M sys/kern/kern_sched.c M sys/kern/kern_sig.c
M sys/kern/kern_synch.c M sys/kern/sched_bsd.c
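(The new calling convention in a sketch, with SCHED_LOCK/SCHED_UNLOCK in
their usual BSD form; the sleep setup around the switch is illustrative:

    /*
     * Sketch: mi_switch() now returns with sched_lock released, so
     * the common pattern drops a LOCK/UNLOCK pair on a contended lock.
     */
    void
    example_block(struct proc *p)
    {
            int s;

            SCHED_LOCK(s);
            p->p_stat = SSLEEP;
            mi_switch();    /* returns with sched_lock unlocked */
            /* old code needed SCHED_UNLOCK(s) here; now we relock
             * only if we still need scheduler state */
    }
)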
commit c0feaac7710dd02810821912d0e6a4f2be92c146
diff: https://github.com/bitrig/bitrig/commit/c0feaac
author: Christiano F. Haesbaert <[email protected]>
date: Mon Dec 30 01:21:49 2013 +0100

Make __mp_lock_held() return the lock depth.

M sys/arch/amd64/amd64/lock_machdep.c

commit 118d50036cde85c0cb46606147e1489782f2924a
diff: https://github.com/bitrig/bitrig/commit/118d500
author: Christiano F. Haesbaert <[email protected]>
date: Mon Dec 30 00:39:49 2013 +0100

Rework the SCHED_LOCK vs KERNEL_LOCK dance in mi_switch().

This avoids the lock/relock to fix lock ordering in mi_switch() for all
cases except one. Before, the idea was:
- mi_switch() releases all kernel locks before context switching,
- saves the count on the stack,
- context switches with SCHED_LOCK held,
- wakes up, but to assure lock ordering it needs to:
  - release SCHED_LOCK
  - reacquire KERNEL_LOCK
  - acquire SCHED_LOCK

With this diff, the caller is responsible for doing this, so you can't
enter mi_switch() holding the kernel lock now; you must
release/reacquire it yourself, being careful to always grab SCHED_LOCK
before releasing the kernel locks, otherwise you lose atomicity before
you can block. Since we can grab the kernel lock within a critical
section, KERNEL_RELOCK_ALL must do the release/reenter dance.

The next step is to make mi_switch() return with SCHED_LOCK unlocked;
if the caller wants it, that's his job, he _already_ lost atomicity. In
most cases mi_switch() relocks SCHED_LOCK only for the caller to unlock
it, which is pretty stupid.

M sys/kern/kern_ithread.c M sys/kern/kern_lock.c M sys/kern/kern_sched.c
M sys/kern/kern_sig.c M sys/kern/kern_synch.c M sys/kern/sched_bsd.c
M sys/sys/proc.h M sys/sys/systm.h
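(The ordering rule reads clearer as code. A sketch where
__mp_release_all()/__mp_acquire_count() are the mplock primitives
implied by this log, with exact signatures assumed; at this commit
mi_switch() still returns with SCHED_LOCK held, and commit 613491f above
later removes the relock:

    /*
     * Sketch: grab SCHED_LOCK before dropping the kernel lock so no
     * window opens between them, then restore in the opposite order
     * once we run again.
     */
    void
    example_switch(void)
    {
            int s, hold;

            SCHED_LOCK(s);                          /* atomicity secured */
            hold = __mp_release_all(&kernel_lock);  /* safe to drop now */
            mi_switch();            /* at this commit: returns locked */
            SCHED_UNLOCK(s);                        /* sched lock first */
            __mp_acquire_count(&kernel_lock, hold); /* restore old depth */
    }
)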
commit c98d82ccbc713ae6773a162e1befe8669731c93c
diff: https://github.com/bitrig/bitrig/commit/c98d82c
author: Christiano F. Haesbaert <[email protected]>
date: Sat May 25 13:19:17 2013 +0200

New interrupt model, move away from IPLs.

This diff changes the interrupt model to something very similar to what
other modern unixes like Solaris, FreeBSD, DragonflyBSD and Linux do. It
also introduces critical sections.

Interrupts except clock and ipi are handled by interrupt threads. When
an interrupt fires, the only job of the small interrupt stub is to
schedule the corresponding interrupt thread; when the ithread gets
scheduled, it services the interrupt. The kernel must be made
preemptive, so that interrupt threads may preempt the currently running
code.

In this model you normally never block interrupts; you rely solely on
locks to protect the code. If the ithread preempts the running code and
tries to acquire a contested lock, it blocks (sleeps) and gets resumed
only when that lock is released. This gives us lower interrupt latency:
instead of blocking interrupts for a long section, you can fine-grain
that section with a specific lock. Instead of raising to IPL_BIO and
preventing softclock from running, for example, we can let softclock
preempt the old IPL_BIO code and only block on actual lock contention.

This was first implemented in Solaris 2 (Kleiman & Eykholt, circa '93);
they demonstrated that this model, once properly locked, changed the
worst-case latency from 1s to <1ms on a single-core 40MHz machine. It
also allows us to properly fight livelocks in the future, as the
scheduler will make sure "everything runs at some point".

The kernel can now be synchronous with regard to locking, as you can use
the same lock to interlock against interrupts or other normal kthreads.
You can only block on locks if you have a process context, so you still
need a way to block the interrupts that are not ithread interrupts, like
clock and ipi. For that we introduce critical sections, which block
everything; in practice they're only used to protect scheduler data and
in mutexes, so they are very short. In the future, critical sections
will also be the only thing that prevents kernel preemption. In order to
prevent deadlocks, you must never be preempted while holding a spinlock,
so a critical section is implied there; this is akin to what every
system does.

In the present state, kernel preemption has not been implemented. All
threads <IPL_CLOCK were moved to ithreads; there is no observable loss
of performance, and it's been stable for the last half year. All spl
calls <IPL_CLOCK were made no-ops, while >= IPL_CLOCK were replaced by
critical sections. Most of the old assembly code has been rewritten in
C, just because I refuse to maintain unnecessary asm blocks.

The next steps are:
o Turn the kernel lock into a rrwlock.
o Enable kernel preemption; at this point all the interrupt interlocking
  will be done through the kernel lock.
o Decide which subsystem to release first, having a wide subsystem lock.

Make sure we have a process context early and kill crit_escaped. This
was lost in a merge.
M sys/arch/amd64/amd64/cpu.c M sys/arch/amd64/amd64/db_interface.c
M sys/arch/amd64/amd64/fpu.c M sys/arch/amd64/amd64/genassym.cf
M sys/arch/amd64/amd64/intr.c M sys/arch/amd64/amd64/ipi.c
M sys/arch/amd64/amd64/lapic.c M sys/arch/amd64/amd64/locore.S
M sys/arch/amd64/amd64/machdep.c M sys/arch/amd64/amd64/spl.S
M sys/arch/amd64/amd64/trap.c M sys/arch/amd64/amd64/vector.S
M sys/arch/amd64/amd64/via.c M sys/arch/amd64/conf/files.amd64
M sys/arch/amd64/include/atomic.h M sys/arch/amd64/include/cpu.h
M sys/arch/amd64/include/frame.h M sys/arch/amd64/include/frameasm.h
M sys/arch/amd64/include/intr.h M sys/arch/amd64/include/intrdefs.h
M sys/arch/amd64/isa/clock.c M sys/arch/arm/arm/db_interface.c
M sys/arch/arm/arm/pmap.c M sys/conf/files M sys/dev/acpi/acpi.c
M sys/dev/cardbus/cardbus.c M sys/dev/ic/com.c M sys/dev/ic/comvar.h
M sys/dev/ic/elink3.c M sys/dev/ic/i82365.c M sys/dev/ic/vga.c
M sys/dev/ic/vga_subr.c M sys/dev/isa/if_ef_isapnp.c
M sys/dev/isa/pcppi.c M sys/dev/onewire/onewire_bitbang.c
M sys/dev/pci/drm/drm_atomic.h M sys/dev/pci/drm/radeon/radeon_kms.c
M sys/dev/pci/mbg.c M sys/dev/pci/pccbb.c M sys/dev/pci/pci.c
M sys/dev/pci/pci_map.c M sys/dev/pci/pciide.c M sys/dev/pcmcia/if_xe.c
M sys/dev/rasops/rasops.c M sys/dev/sdmmc/sdmmc_io.c
M sys/dev/usb/ehci.c M sys/dev/usb/ohci.c M sys/dev/usb/uhci.c
M sys/dev/usb/umidi.c M sys/dev/usb/usb.c M sys/dev/usb/usbdivar.h
M sys/dev/usb/usbf_subr.c M sys/dev/usb/usbfvar.h M sys/dev/usb/xhci.c
M sys/dev/wsfont/wsfont.c M sys/kern/init_main.c M sys/kern/kern_clock.c
A sys/kern/kern_crit.c M sys/kern/kern_event.c M sys/kern/kern_exec.c
M sys/kern/kern_fork.c A sys/kern/kern_ithread.c M sys/kern/kern_lock.c
M sys/kern/kern_mutex.c M sys/kern/kern_proc.c M sys/kern/kern_resource.c
M sys/kern/kern_sched.c M sys/kern/kern_sensors.c M sys/kern/kern_sig.c
M sys/kern/kern_synch.c M sys/kern/kern_time.c M sys/kern/kern_timeout.c
M sys/kern/sched_bsd.c M sys/kern/subr_disk.c M sys/kern/subr_evcount.c
M sys/kern/subr_hibernate.c M sys/kern/subr_log.c M sys/kern/subr_pool.c
M sys/kern/subr_prf.c M sys/kern/subr_prof.c M sys/kern/subr_xxx.c
M sys/kern/sys_generic.c M sys/kern/sys_process.c M sys/kern/vfs_subr.c
M sys/kern/vfs_sync.c M sys/lib/libkern/libkern.h M sys/net/if_tun.c
M sys/net/netisr.c M sys/net/netisr.h M sys/net/pipex.c
A sys/sys/ithread.h M sys/sys/proc.h M sys/sys/sched.h M sys/sys/systm.h
M sys/sys/timeout.h M sys/uvm/uvm_map.c
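(To close, the whole model in one sketch: the stub only masks and
schedules, the thread services and unmasks. ithread_run(),
ithread_sleep() and intrsource_unmask() appear in this log, as does the
TAILQ handler list; the remaining names and the struct layout are
assumptions:

    /* Interrupt stub: runs at interrupt time, does almost nothing. */
    void
    intr_stub(struct intrsource *is)
    {
            intrsource_mask(is);            /* silence the source */
            ithread_run(is->is_ithread);    /* make the ithread runnable */
    }

    /* Interrupt thread: runs like any other thread, may block on locks. */
    void
    ithread_loop(struct intrsource *is)
    {
            struct intrhand *ih;

            for (;;) {
                    TAILQ_FOREACH(ih, &is->is_handlers, ih_entry)
                            (*ih->ih_fun)(ih->ih_arg);
                    intrsource_unmask(is);  /* re-enable the source */
                    ithread_sleep(is);      /* wait for the next fire */
            }
    }
)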
