Re: [Devel] [PATCH v2 vz8] kernel/sched/fair.c: Add missing update_rq_clock() calls
On 29.09.2020 15:07, Andrey Ryabinin wrote:
>
>
> On 9/29/20 11:24 AM, Kirill Tkhai wrote:
>> On 28.09.2020 15:03, Andrey Ryabinin wrote:
>>> We've got a hard lockup which seems to be caused by the mgag200
>>> console printk code calling schedule_work() from the scheduler
>>> with rq->lock held:
>>>
>>>   #5 [b79e034239a8] native_queued_spin_lock_slowpath at 8b50c6c6
>>>   #6 [b79e034239a8] _raw_spin_lock at 8bc96e5c
>>>   #7 [b79e034239b0] try_to_wake_up at 8b4e26ff
>>>   #8 [b79e03423a10] __queue_work at 8b4ce3f3
>>>   #9 [b79e03423a58] queue_work_on at 8b4ce714
>>>   #10 [b79e03423a68] mga_imageblit at c026d666 [mgag200]
>>>   #11 [b79e03423a80] soft_cursor at 8b8a9d84
>>>   #12 [b79e03423ad8] bit_cursor at 8b8a99b2
>>>   #13 [b79e03423ba0] hide_cursor at 8b93bc7a
>>>   #14 [b79e03423bb0] vt_console_print at 8b93e07d
>>>   #15 [b79e03423c18] console_unlock at 8b518f0e
>>>   #16 [b79e03423c68] vprintk_emit_log at 8b51acf7
>>>   #17 [b79e03423cc0] vprintk_default at 8b51adcd
>>>   #18 [b79e03423cd0] printk at 8b51b3d6
>>>   #19 [b79e03423d30] __warn_printk at 8b4b13a0
>>>   #20 [b79e03423d98] assert_clock_updated at 8b4dd293
>>>   #21 [b79e03423da0] deactivate_task at 8b4e12d1
>>>   #22 [b79e03423dc8] move_task_group at 8b4eaa5b
>>>   #23 [b79e03423e00] cpulimit_balance_cpu_stop at 8b4f02f3
>>>   #24 [b79e03423eb0] cpu_stopper_thread at 8b576b67
>>>   #25 [b79e03423ee8] smpboot_thread_fn at 8b4d9125
>>>   #26 [b79e03423f10] kthread at 8b4d4fc2
>>>   #27 [b79e03423f50] ret_from_fork at 8be00255
>>>
>>> The printk was called because assert_clock_updated() triggered
>>> 	SCHED_WARN_ON(rq->clock_update_flags < RQCF_ACT_SKIP);
>>>
>>> This means that we are missing a necessary update_rq_clock() call.
>>> Add one to cpulimit_balance_cpu_stop() to fix the warning.
>>> Also add one in load_balance() before the move_task_groups() call,
>>> which seems to be another place missing it.
>>>
>>> https://jira.sw.ru/browse/PSBM-108013
>>> Signed-off-by: Andrey Ryabinin
>>> ---
>>>  kernel/sched/fair.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 5d3556b15e70..e6dc21d5fa03 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -7816,6 +7816,7 @@ static int cpulimit_balance_cpu_stop(void *data)
>>>
>>>  	schedstat_inc(sd->clb_count);
>>>
>>> +	update_rq_clock(rq);
>>
>> Shouldn't we also add the same for target_rq to avoid WARN() coming from
>> attach_task()?
>>
>
> It seems like we should.

Are you going to send a v3 or a patch on top of this?

___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel
Re: [Devel] [PATCH v2 vz8] kernel/sched/fair.c: Add missing update_rq_clock() calls
On 9/29/20 11:24 AM, Kirill Tkhai wrote:
> On 28.09.2020 15:03, Andrey Ryabinin wrote:
>> We've got a hard lockup which seems to be caused by the mgag200
>> console printk code calling schedule_work() from the scheduler
>> with rq->lock held:
>>
>>   #5 [b79e034239a8] native_queued_spin_lock_slowpath at 8b50c6c6
>>   #6 [b79e034239a8] _raw_spin_lock at 8bc96e5c
>>   #7 [b79e034239b0] try_to_wake_up at 8b4e26ff
>>   #8 [b79e03423a10] __queue_work at 8b4ce3f3
>>   #9 [b79e03423a58] queue_work_on at 8b4ce714
>>   #10 [b79e03423a68] mga_imageblit at c026d666 [mgag200]
>>   #11 [b79e03423a80] soft_cursor at 8b8a9d84
>>   #12 [b79e03423ad8] bit_cursor at 8b8a99b2
>>   #13 [b79e03423ba0] hide_cursor at 8b93bc7a
>>   #14 [b79e03423bb0] vt_console_print at 8b93e07d
>>   #15 [b79e03423c18] console_unlock at 8b518f0e
>>   #16 [b79e03423c68] vprintk_emit_log at 8b51acf7
>>   #17 [b79e03423cc0] vprintk_default at 8b51adcd
>>   #18 [b79e03423cd0] printk at 8b51b3d6
>>   #19 [b79e03423d30] __warn_printk at 8b4b13a0
>>   #20 [b79e03423d98] assert_clock_updated at 8b4dd293
>>   #21 [b79e03423da0] deactivate_task at 8b4e12d1
>>   #22 [b79e03423dc8] move_task_group at 8b4eaa5b
>>   #23 [b79e03423e00] cpulimit_balance_cpu_stop at 8b4f02f3
>>   #24 [b79e03423eb0] cpu_stopper_thread at 8b576b67
>>   #25 [b79e03423ee8] smpboot_thread_fn at 8b4d9125
>>   #26 [b79e03423f10] kthread at 8b4d4fc2
>>   #27 [b79e03423f50] ret_from_fork at 8be00255
>>
>> The printk was called because assert_clock_updated() triggered
>> 	SCHED_WARN_ON(rq->clock_update_flags < RQCF_ACT_SKIP);
>>
>> This means that we are missing a necessary update_rq_clock() call.
>> Add one to cpulimit_balance_cpu_stop() to fix the warning.
>> Also add one in load_balance() before the move_task_groups() call,
>> which seems to be another place missing it.
>>
>> https://jira.sw.ru/browse/PSBM-108013
>> Signed-off-by: Andrey Ryabinin
>> ---
>>  kernel/sched/fair.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 5d3556b15e70..e6dc21d5fa03 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -7816,6 +7816,7 @@ static int cpulimit_balance_cpu_stop(void *data)
>>
>>  	schedstat_inc(sd->clb_count);
>>
>> +	update_rq_clock(rq);
>
> Shouldn't we also add the same for target_rq to avoid WARN() coming from
> attach_task()?
>

It seems like we should.
Re: [Devel] [PATCH v2 vz8] kernel/sched/fair.c: Add missing update_rq_clock() calls
On 28.09.2020 15:03, Andrey Ryabinin wrote:
> We've got a hard lockup which seems to be caused by the mgag200
> console printk code calling schedule_work() from the scheduler
> with rq->lock held:
>
>   #5 [b79e034239a8] native_queued_spin_lock_slowpath at 8b50c6c6
>   #6 [b79e034239a8] _raw_spin_lock at 8bc96e5c
>   #7 [b79e034239b0] try_to_wake_up at 8b4e26ff
>   #8 [b79e03423a10] __queue_work at 8b4ce3f3
>   #9 [b79e03423a58] queue_work_on at 8b4ce714
>   #10 [b79e03423a68] mga_imageblit at c026d666 [mgag200]
>   #11 [b79e03423a80] soft_cursor at 8b8a9d84
>   #12 [b79e03423ad8] bit_cursor at 8b8a99b2
>   #13 [b79e03423ba0] hide_cursor at 8b93bc7a
>   #14 [b79e03423bb0] vt_console_print at 8b93e07d
>   #15 [b79e03423c18] console_unlock at 8b518f0e
>   #16 [b79e03423c68] vprintk_emit_log at 8b51acf7
>   #17 [b79e03423cc0] vprintk_default at 8b51adcd
>   #18 [b79e03423cd0] printk at 8b51b3d6
>   #19 [b79e03423d30] __warn_printk at 8b4b13a0
>   #20 [b79e03423d98] assert_clock_updated at 8b4dd293
>   #21 [b79e03423da0] deactivate_task at 8b4e12d1
>   #22 [b79e03423dc8] move_task_group at 8b4eaa5b
>   #23 [b79e03423e00] cpulimit_balance_cpu_stop at 8b4f02f3
>   #24 [b79e03423eb0] cpu_stopper_thread at 8b576b67
>   #25 [b79e03423ee8] smpboot_thread_fn at 8b4d9125
>   #26 [b79e03423f10] kthread at 8b4d4fc2
>   #27 [b79e03423f50] ret_from_fork at 8be00255
>
> The printk was called because assert_clock_updated() triggered
> 	SCHED_WARN_ON(rq->clock_update_flags < RQCF_ACT_SKIP);
>
> This means that we are missing a necessary update_rq_clock() call.
> Add one to cpulimit_balance_cpu_stop() to fix the warning.
> Also add one in load_balance() before the move_task_groups() call,
> which seems to be another place missing it.
>
> https://jira.sw.ru/browse/PSBM-108013
> Signed-off-by: Andrey Ryabinin
> ---
>  kernel/sched/fair.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5d3556b15e70..e6dc21d5fa03 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7816,6 +7816,7 @@ static int cpulimit_balance_cpu_stop(void *data)
>
>  	schedstat_inc(sd->clb_count);
>
> +	update_rq_clock(rq);

Shouldn't we also add the same for target_rq to avoid WARN() coming from
attach_task()?

>  	if (do_cpulimit_balance())
>  		schedstat_inc(sd->clb_pushed);
>  	else
> @@ -9176,6 +9177,7 @@ static int load_balance(int this_cpu, struct rq
> *this_rq,
>  		env.loop = 0;
>  		local_irq_save(rf.flags);
>  		double_rq_lock(env.dst_rq, busiest);
> +		update_rq_clock(env.dst_rq);
>  		cur_ld_moved = ld_moved = move_task_groups();
>  		double_rq_unlock(env.dst_rq, busiest);
>  		local_irq_restore(rf.flags);
>
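The target_rq question above can be illustrated with a small userspace
model. This is only a sketch, not code from the vz8 tree: struct toy_rq
and the toy_* helpers are hypothetical names, and the only things taken
from the kernel are the two flag values (as defined in
kernel/sched/sched.h) and the comparison quoted in the patch description.
It assumes, as the thread suggests, that the detach side checks the
source runqueue's clock and attach_task() checks the target's.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as in kernel/sched/sched.h. */
#define RQCF_ACT_SKIP	0x02
#define RQCF_UPDATED	0x04

/* Hypothetical toy runqueue: only the flag word the debug check reads. */
struct toy_rq {
	unsigned int clock_update_flags;
};

/* True when the SCHED_WARN_ON() condition from the patch description holds. */
static bool toy_clock_is_stale(const struct toy_rq *rq)
{
	return rq->clock_update_flags < RQCF_ACT_SKIP;
}

/* Stand-in for update_rq_clock(): marks the clock as updated. */
static void toy_update_rq_clock(struct toy_rq *rq)
{
	rq->clock_update_flags |= RQCF_UPDATED;
}

/*
 * Model of moving one task from src to dst. Each side consults its own
 * runqueue clock, so each side is checked separately.
 * Returns the number of warnings the move would raise.
 */
static int toy_move_task(const struct toy_rq *src, const struct toy_rq *dst)
{
	int warnings = 0;

	assert(src != dst);		/* a move needs two distinct runqueues */
	if (toy_clock_is_stale(src))	/* deactivate_task() side */
		warnings++;
	if (toy_clock_is_stale(dst))	/* attach_task() side */
		warnings++;
	return warnings;
}
```

In this model, updating only the source runqueue still leaves one warning
for the target; updating both gives zero.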
[Devel] [PATCH v2 vz8] kernel/sched/fair.c: Add missing update_rq_clock() calls
We've got a hard lockup which seems to be caused by the mgag200
console printk code calling schedule_work() from the scheduler
with rq->lock held:

  #5 [b79e034239a8] native_queued_spin_lock_slowpath at 8b50c6c6
  #6 [b79e034239a8] _raw_spin_lock at 8bc96e5c
  #7 [b79e034239b0] try_to_wake_up at 8b4e26ff
  #8 [b79e03423a10] __queue_work at 8b4ce3f3
  #9 [b79e03423a58] queue_work_on at 8b4ce714
  #10 [b79e03423a68] mga_imageblit at c026d666 [mgag200]
  #11 [b79e03423a80] soft_cursor at 8b8a9d84
  #12 [b79e03423ad8] bit_cursor at 8b8a99b2
  #13 [b79e03423ba0] hide_cursor at 8b93bc7a
  #14 [b79e03423bb0] vt_console_print at 8b93e07d
  #15 [b79e03423c18] console_unlock at 8b518f0e
  #16 [b79e03423c68] vprintk_emit_log at 8b51acf7
  #17 [b79e03423cc0] vprintk_default at 8b51adcd
  #18 [b79e03423cd0] printk at 8b51b3d6
  #19 [b79e03423d30] __warn_printk at 8b4b13a0
  #20 [b79e03423d98] assert_clock_updated at 8b4dd293
  #21 [b79e03423da0] deactivate_task at 8b4e12d1
  #22 [b79e03423dc8] move_task_group at 8b4eaa5b
  #23 [b79e03423e00] cpulimit_balance_cpu_stop at 8b4f02f3
  #24 [b79e03423eb0] cpu_stopper_thread at 8b576b67
  #25 [b79e03423ee8] smpboot_thread_fn at 8b4d9125
  #26 [b79e03423f10] kthread at 8b4d4fc2
  #27 [b79e03423f50] ret_from_fork at 8be00255

The printk was called because assert_clock_updated() triggered
	SCHED_WARN_ON(rq->clock_update_flags < RQCF_ACT_SKIP);

This means that we are missing a necessary update_rq_clock() call.
Add one to cpulimit_balance_cpu_stop() to fix the warning.
Also add one in load_balance() before the move_task_groups() call,
which seems to be another place missing it.

https://jira.sw.ru/browse/PSBM-108013
Signed-off-by: Andrey Ryabinin
---
 kernel/sched/fair.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5d3556b15e70..e6dc21d5fa03 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7816,6 +7816,7 @@ static int cpulimit_balance_cpu_stop(void *data)
 
 	schedstat_inc(sd->clb_count);
 
+	update_rq_clock(rq);
 	if (do_cpulimit_balance())
 		schedstat_inc(sd->clb_pushed);
 	else
@@ -9176,6 +9177,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		env.loop = 0;
 		local_irq_save(rf.flags);
 		double_rq_lock(env.dst_rq, busiest);
+		update_rq_clock(env.dst_rq);
 		cur_ld_moved = ld_moved = move_task_groups();
 		double_rq_unlock(env.dst_rq, busiest);
 		local_irq_restore(rf.flags);
-- 
2.26.2
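For readers unfamiliar with the check that fired, here is a minimal
userspace sketch of the clock_update_flags bookkeeping. It is a model
under stated assumptions, not the kernel implementation: struct toy_rq and
all toy_* functions are hypothetical names. The flag values match
kernel/sched/sched.h, and clearing RQCF_UPDATED when the lock is taken
mirrors what rq_pin_lock() does in mainline with scheduler debugging
enabled; whether the vz8 locking paths do exactly the same is assumed
here, not verified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Flag values as in kernel/sched/sched.h. */
#define RQCF_REQ_SKIP	0x01
#define RQCF_ACT_SKIP	0x02
#define RQCF_UPDATED	0x04

/* Hypothetical toy runqueue: only what the check needs, not struct rq. */
struct toy_rq {
	bool locked;
	unsigned int clock_update_flags;
	unsigned int warnings;	/* counts would-be SCHED_WARN_ON() hits */
};

/* Taking the lock forgets any earlier clock update (assumed, see above). */
static void toy_rq_lock(struct toy_rq *rq)
{
	assert(!rq->locked);
	rq->locked = true;
	rq->clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP);
}

static void toy_rq_unlock(struct toy_rq *rq)
{
	assert(rq->locked);
	rq->locked = false;
}

/* Stand-in for update_rq_clock(): only records that an update happened. */
static void toy_update_rq_clock(struct toy_rq *rq)
{
	assert(rq->locked);
	rq->clock_update_flags |= RQCF_UPDATED;
}

/* Same comparison as the SCHED_WARN_ON() quoted in the patch description. */
static void toy_assert_clock_updated(struct toy_rq *rq)
{
	if (rq->clock_update_flags < RQCF_ACT_SKIP) {
		rq->warnings++;
		fprintf(stderr, "WARN: rq clock not updated\n");
	}
}

/* Stand-in for deactivate_task(): it reads the clock, so it checks first. */
static void toy_deactivate_task(struct toy_rq *rq)
{
	assert(rq->locked);
	toy_assert_clock_updated(rq);
}

/* Returns the number of warnings one lock/deactivate/unlock cycle raises. */
static unsigned int toy_balance(bool update_clock)
{
	struct toy_rq rq = { 0 };

	toy_rq_lock(&rq);
	if (update_clock)
		toy_update_rq_clock(&rq);
	toy_deactivate_task(&rq);
	toy_rq_unlock(&rq);
	return rq.warnings;
}
```

Calling toy_balance(false) models the path in the trace and raises one
warning; toy_balance(true) models the patched path and raises none. In
the real kernel the warning itself is what turned into a lockup, because
the printk ran with rq->lock held.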