Re: ULE patch, call for testers
On 2012/11/05 17:13, Andriy Gapon wrote:
> on 05/11/2012 04:41 David Xu said the following:
>> Another problem I remembered is that a thread on the runqueue may be
>> starved, because ULE treats a sleeping thread and a thread waiting on the
>> runqueue differently. [...]
>
> I also noticed this issue and I've been playing with the following patch.
> Two points:
>
> o I am not sure if it is ideologically correct
> o it didn't improve the behavior of my workloads much
>
> In any case, here it is:
>
> - extend accounted interactive sleep time to the point where a thread runs
>   (as opposed to when it is added to the runq)
>
> [full patch snipped; it is reproduced in Andriy's message below]

What I want is fairness between waiting on the runqueue and waiting on the
sleep queue. Suppose you have N threads on the runqueue, T1, T2, T3, ..., Tn,
and a thread T(n+1) on the sleep queue. If the CPU runs threads T1...Tn in
round-robin fashion, then by the time Tn gets to run, a total of n-1 slices
have passed. If thread T(n+1) is woken up at that moment, the scheduler's
sched_interact_score() will give it higher priority than Tn. This is unfair,
because both threads have spent the same total time waiting for the CPU.

Does your patch fix the problem?

Regards,
David Xu
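For context on the scenario above: ULE's interactivity score is driven by the
ratio of a thread's voluntary sleep time to its run time, and time spent
waiting on the runqueue does not enter the calculation at all. The fragment
below is a rough standalone approximation of that scoring (the constants,
names, and arithmetic are simplified for illustration; this is not the actual
sched_ule.c code). It shows why the thread woken from the sleep queue ends up
scored as interactive, while a thread that waited just as long on the runqueue
scores as a CPU hog.

/*
 * Standalone approximation of ULE's interactivity scoring.  A lower score
 * means "more interactive" and therefore a better timeshare priority.
 */
#include <stdio.h>

#define	INTERACT_MAX	100
#define	INTERACT_HALF	(INTERACT_MAX / 2)

static int
interact_score(unsigned long runtime, unsigned long slptime)
{
	if (runtime > slptime)
		return (INTERACT_HALF +
		    (INTERACT_HALF - slptime * INTERACT_HALF / runtime));
	if (slptime > runtime)
		return (runtime * INTERACT_HALF / slptime);
	return (INTERACT_HALF);		/* equal run and sleep time */
}

int
main(void)
{
	/*
	 * Both threads ran for 10 ticks and then waited 90 ticks for the
	 * CPU -- one on the sleep queue, one on the runqueue.  Only the
	 * sleeper's wait is counted, so only the sleeper looks interactive.
	 */
	printf("woken sleeper: %d\n", interact_score(10, 90));	/* low score */
	printf("runq waiter:   %d\n", interact_score(10, 0));	/* max score */
	return (0);
}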
Re: ULE patch, call for testers
on 05/11/2012 04:41 David Xu said the following:
> Another problem I remembered is that a thread on the runqueue may be
> starved, because ULE treats a sleeping thread and a thread waiting on the
> runqueue differently. If a thread has slept for a while, its priority is
> boosted after it is woken up, but the priority of a thread on the runqueue
> is never boosted. In essence they should be treated the same, because both
> of them are waiting for the CPU. If I were a thread, I'd rather wait on the
> sleep queue than on the runqueue: in the former case I get a bonus, while
> in the latter I get nothing. Under heavy load there are many runnable
> threads, and this unfairness can cause a very low priority thread on the
> runqueue to be starved. 4BSD does not seem to suffer from this problem,
> because it also decays the CPU time of threads on the runqueue. I think ULE
> needs some anti-starvation code to give a thread a shot if it has been
> waiting on the runqueue for too long.

I also noticed this issue and I've been playing with the following patch.
Two points:

o I am not sure if it is ideologically correct
o it didn't improve the behavior of my workloads much

In any case, here it is:

- extend accounted interactive sleep time to the point where a thread runs
  (as opposed to when it is added to the runq)

--- a/sys/kern/sched_ule.c
+++ b/sys/kern/sched_ule.c
@@ -1898,8 +1899,21 @@ sched_switch(struct thread *td, struct thread *newtd, int flags)
 	SDT_PROBE2(sched, , , off_cpu, td, td->td_proc);
 	lock_profile_release_lock(&TDQ_LOCKPTR(tdq)->lock_object);
 	TDQ_LOCKPTR(tdq)->mtx_lock = (uintptr_t)newtd;
+#if 1
+	/*
+	 * If we slept for more than a tick update our interactivity and
+	 * priority.
+	 */
+	int slptick;
+	slptick = newtd->td_slptick;
+	newtd->td_slptick = 0;
+	if (slptick && slptick != ticks) {
+		newtd->td_sched->ts_slptime +=
+		    (ticks - slptick) << SCHED_TICK_SHIFT;
+		sched_interact_update(newtd);
+	}
+#endif
 	sched_pctcpu_update(newtd->td_sched, 0);
-
 #ifdef KDTRACE_HOOKS
 	/*
 	 * If DTrace has set the active vtime enum to anything
@@ -1990,6 +2004,7 @@ sched_wakeup(struct thread *td)
 	THREAD_LOCK_ASSERT(td, MA_OWNED);
 	ts = td->td_sched;
 	td->td_flags &= ~TDF_CANSWAP;
+#if 0
 	/*
 	 * If we slept for more than a tick update our interactivity and
 	 * priority.
@@ -2001,6 +2016,7 @@ sched_wakeup(struct thread *td)
 		sched_interact_update(td);
 		sched_pctcpu_update(ts, 0);
 	}
+#endif
 	/* Reset the slice value after we sleep. */
 	ts->ts_slice = sched_slice;
 	sched_add(td, SRQ_BORING);

--
Andriy Gapon
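In plain terms, the patch defers the crediting of accumulated sleep time from
sched_wakeup() (when the thread becomes runnable) to sched_switch() (when the
thread is actually put on a CPU), so the interval spent sitting on the
runqueue after a wakeup is also folded into the interactivity score. The
sketch below restates that effect in isolation; it is a simplification, not
the real kernel code, and it assumes td_slptick is only stamped when a thread
goes to sleep, so a thread that was runnable all along still accrues no credit
for its runqueue wait, which bears on David's question above.

/*
 * Simplified restatement of the crediting logic from the diff above (not the
 * actual kernel code).  Before the patch this ran at wakeup time; after the
 * patch it runs when the woken thread is finally switched in, so
 * wakeup-to-run latency counts as "sleep" for the interactivity score.
 */
static void
credit_wait_as_sleep(struct thread *td)
{
	int slptick;

	slptick = td->td_slptick;	/* stamped when the thread slept */
	td->td_slptick = 0;
	if (slptick && slptick != ticks) {
		td->td_sched->ts_slptime +=
		    (ticks - slptick) << SCHED_TICK_SHIFT;
		sched_interact_update(td);
	}
}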
Re: ULE patch, call for testers
On 2012/11/03 02:26, Jeff Roberson wrote:
> I have a small patch to the ULE scheduler that makes a fairly large change
> to the way timeshare threads are handled.
>
> http://people.freebsd.org/~jeff/schedslice.diff
>
> Previously ULE used a fixed slice size for all timeshare threads. Now it
> scales the slice size down based on load. This should reduce latency for
> timeshare threads as load increases. It is important to note that this does
> not impact interactive threads. But when a thread transitions to interactive
> from timeshare it should see some improvement. This happens when something
> like Xorg chews up a lot of CPU.
>
> If anyone has perf tests they'd like to run please report back. I have done
> a handful of validation.
>
> Thanks,
> Jeff

Another problem I remembered is that a thread on the runqueue may be starved,
because ULE treats a sleeping thread and a thread waiting on the runqueue
differently. If a thread has slept for a while, its priority is boosted after
it is woken up, but the priority of a thread on the runqueue is never boosted.
In essence they should be treated the same, because both of them are waiting
for the CPU. If I were a thread, I'd rather wait on the sleep queue than on
the runqueue: in the former case I get a bonus, while in the latter I get
nothing. Under heavy load there are many runnable threads, and this unfairness
can cause a very low priority thread on the runqueue to be starved. 4BSD does
not seem to suffer from this problem, because it also decays the CPU time of
threads on the runqueue. I think ULE needs some anti-starvation code to give a
thread a shot if it has been waiting on the runqueue for too long; a sketch of
one possible shape of that follows below.

Regards,
David Xu
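To make the anti-starvation idea concrete, one possible shape of it is
sketched here. It is purely hypothetical -- the td_runqtick field and the
helper functions are invented for illustration and do not exist in
sched_ule.c -- but it captures the suggestion: timestamp a thread when it is
enqueued, and when it finally gets the CPU, fold the runqueue wait into the
same sleep-time bucket the interactivity score already uses.

/*
 * Hypothetical sketch only: td_runqtick and these helpers are not part of
 * sched_ule.c.  The idea is to let runqueue wait time raise a thread's
 * interactivity (and thus priority) the same way sleep time does, so a
 * low-priority thread stuck behind many runnable threads cannot starve.
 */
static void
runq_enqueue_stamp(struct thread *td)
{
	td->td_runqtick = ticks;	/* remember when the wait began */
}

static void
runq_switchin_credit(struct thread *td)
{
	int waited;

	waited = ticks - td->td_runqtick;
	if (waited > 0) {
		/* Count runqueue wait like sleep for the interactivity score. */
		td->td_sched->ts_slptime += waited << SCHED_TICK_SHIFT;
		sched_interact_update(td);
	}
}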
Re: ULE patch, call for testers
On Fri, 2 Nov 2012, Eitan Adler wrote:

> On 2 November 2012 14:26, Jeff Roberson wrote:
>> I have a small patch to the ULE scheduler that makes a fairly large change
>> to the way timeshare threads are handled.
>> [...]
>
> does it make sense to make these sysctls?
>
> +#define	SCHED_SLICE_DEFAULT_DIVISOR	10	/* 100 ms. */
> +#define	SCHED_SLICE_MIN_DIVISOR		4	/* DEFAULT/MIN = 25 ms. */

DEFAULT_DIVISOR is indirectly adjustable through the sysctls that modify the
slice. The min divisor could be. I will consider adding that.

Thanks,
Jeff
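For reference, exposing the minimum divisor would amount to a declaration
along these lines (the sysctl name and variable are hypothetical, not
committed code; as noted above, only the slice itself is reachable through
the existing kern.sched sysctls):

/*
 * Hypothetical kern.sched.slice_min_divisor tunable -- invented for
 * illustration, not part of the patch.
 */
static int sched_slice_min_divisor = 4;	/* mirrors SCHED_SLICE_MIN_DIVISOR */
SYSCTL_INT(_kern_sched, OID_AUTO, slice_min_divisor, CTLFLAG_RW,
    &sched_slice_min_divisor, 0,
    "Divisor of the default slice that sets the minimum time slice");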
Re: ULE patch, call for testers
On 2 November 2012 14:26, Jeff Roberson wrote:
> I have a small patch to the ULE scheduler that makes a fairly large change
> to the way timeshare threads are handled.
>
> http://people.freebsd.org/~jeff/schedslice.diff
>
> Previously ULE used a fixed slice size for all timeshare threads. Now it
> scales the slice size down based on load. This should reduce latency for
> timeshare threads as load increases. It is important to note that this does
> not impact interactive threads. But when a thread transitions to interactive
> from timeshare it should see some improvement. This happens when something
> like Xorg chews up a lot of CPU.
>
> If anyone has perf tests they'd like to run please report back. I have done
> a handful of validation.

does it make sense to make these sysctls?

+#define	SCHED_SLICE_DEFAULT_DIVISOR	10	/* 100 ms. */
+#define	SCHED_SLICE_MIN_DIVISOR		4	/* DEFAULT/MIN = 25 ms. */

--
Eitan Adler
ULE patch, call for testers
I have a small patch to the ULE scheduler that makes a fairly large change to
the way timeshare threads are handled.

http://people.freebsd.org/~jeff/schedslice.diff

Previously ULE used a fixed slice size for all timeshare threads. Now it
scales the slice size down based on load. This should reduce latency for
timeshare threads as load increases. It is important to note that this does
not impact interactive threads. But when a thread transitions to interactive
from timeshare it should see some improvement. This happens when something
like Xorg chews up a lot of CPU.

If anyone has perf tests they'd like to run please report back. I have done a
handful of validation.

Thanks,
Jeff
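As a rough illustration of the mechanism described above (names and exact
arithmetic are invented; the real logic is in schedslice.diff): the timeshare
slice shrinks as the number of runnable threads grows, bounded below by a
minimum so every thread still makes some progress per round.

/*
 * Illustrative sketch of load-scaled time slices, not the actual patch.
 * With little load a timeshare thread gets the default slice; as runnable
 * threads pile up, each gets a proportionally smaller slice, never below a
 * fixed floor.  The 100 ms / 25 ms figures follow the divisors quoted
 * elsewhere in the thread and assume hz = 1000.
 */
#define	SLICE_DEFAULT	100	/* ticks, ~100 ms at hz = 1000 */
#define	SLICE_MIN	25	/* ticks, floor under heavy load */

static int
slice_for_load(int nrunnable)
{
	int slice;

	if (nrunnable <= 1)
		return (SLICE_DEFAULT);
	slice = SLICE_DEFAULT / nrunnable;	/* shrink with load */
	if (slice < SLICE_MIN)
		slice = SLICE_MIN;		/* keep a floor */
	return (slice);
}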