This is a note to let you know that I've just added the patch titled

    xen: don't reschedule in preemption off sections

to the 4.14-stable tree which can be found at:
    
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     xen-don-t-reschedule-in-preemption-off-sections.patch
and it can be found in the queue-4.14 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <sta...@vger.kernel.org> know about it.


>From jgr...@suse.com  Mon Aug 24 10:12:00 2020
From: Juergen Gross <jgr...@suse.com>
Date: Thu, 20 Aug 2020 08:59:08 +0200
Subject: xen: don't reschedule in preemption off sections
To: sta...@vger.kernel.org
Cc: xen-devel@lists.xenproject.org, Juergen Gross <jgr...@suse.com>, Boris 
Ostrovsky <boris.ostrov...@oracle.com>, Stefano Stabellini 
<sstabell...@kernel.org>, Sarah Newman <s...@prgmr.com>, Chris Brannon 
<c...@prgmr.com>
Message-ID: <20200820065908.20592-1-jgr...@suse.com>

From: Juergen Gross <jgr...@suse.com>

For support of long running hypercalls xen_maybe_preempt_hcall() is
calling cond_resched() in case a hypercall marked as preemptible has
been interrupted.

Normally this is no problem, as only hypercalls done via some ioctl()s
are marked to be preemptible. In rare cases when during such a
preemptible hypercall an interrupt occurs and any softirq action is
started from irq_exit(), a further hypercall issued by the softirq
handler will be regarded to be preemptible, too. This might lead to
rescheduling in spite of the softirq handler potentially having set
preempt_disable(), leading to splats like:

BUG: sleeping function called from invalid context at drivers/xen/preempt.c:37
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 20775, name: xl
INFO: lockdep is turned off.
CPU: 1 PID: 20775 Comm: xl Tainted: G D W 5.4.46-1_prgmr_debug.el7.x86_64 #1
Call Trace:
<IRQ>
dump_stack+0x8f/0xd0
___might_sleep.cold.76+0xb2/0x103
xen_maybe_preempt_hcall+0x48/0x70
xen_do_hypervisor_callback+0x37/0x40
RIP: e030:xen_hypercall_xen_version+0xa/0x20
Code: ...
RSP: e02b:ffffc900400dcc30 EFLAGS: 00000246
RAX: 000000000004000d RBX: 0000000000000200 RCX: ffffffff8100122a
RDX: ffff88812e788000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffffff83ee3ad0 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000246 R12: ffff8881824aa0b0
R13: 0000000865496000 R14: 0000000865496000 R15: ffff88815d040000
? xen_hypercall_xen_version+0xa/0x20
? xen_force_evtchn_callback+0x9/0x10
? check_events+0x12/0x20
? xen_restore_fl_direct+0x1f/0x20
? _raw_spin_unlock_irqrestore+0x53/0x60
? debug_dma_sync_single_for_cpu+0x91/0xc0
? _raw_spin_unlock_irqrestore+0x53/0x60
? xen_swiotlb_sync_single_for_cpu+0x3d/0x140
? mlx4_en_process_rx_cq+0x6b6/0x1110 [mlx4_en]
? mlx4_en_poll_rx_cq+0x64/0x100 [mlx4_en]
? net_rx_action+0x151/0x4a0
? __do_softirq+0xed/0x55b
? irq_exit+0xea/0x100
? xen_evtchn_do_upcall+0x2c/0x40
? xen_do_hypervisor_callback+0x29/0x40
</IRQ>
? xen_hypercall_domctl+0xa/0x20
? xen_hypercall_domctl+0x8/0x20
? privcmd_ioctl+0x221/0x990 [xen_privcmd]
? do_vfs_ioctl+0xa5/0x6f0
? ksys_ioctl+0x60/0x90
? trace_hardirqs_off_thunk+0x1a/0x20
? __x64_sys_ioctl+0x16/0x20
? do_syscall_64+0x62/0x250
? entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fix that by testing preempt_count() before calling cond_resched().

In kernel 5.8 this can't happen any more due to the entry code rework
(more than 100 patches, so not a candidate for backporting).

The issue was introduced in kernel 4.3, so this patch should go into
all stable kernels in [4.3 ... 5.7].

Reported-by: Sarah Newman <s...@prgmr.com>
Fixes: 0fa2f5cb2b0ecd8 ("sched/preempt, xen: Use need_resched() instead of 
should_resched()")
Cc: Sarah Newman <s...@prgmr.com>
Cc: sta...@vger.kernel.org
Signed-off-by: Juergen Gross <jgr...@suse.com>
Tested-by: Chris Brannon <c...@prgmr.com>
Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
---
 drivers/xen/preempt.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/xen/preempt.c
+++ b/drivers/xen/preempt.c
@@ -31,7 +31,7 @@ EXPORT_SYMBOL_GPL(xen_in_preemptible_hca
 asmlinkage __visible void xen_maybe_preempt_hcall(void)
 {
        if (unlikely(__this_cpu_read(xen_in_preemptible_hcall)
-                    && need_resched())) {
+                    && need_resched() && !preempt_count())) {
                /*
                 * Clear flag as we may be rescheduled on a different
                 * cpu.


Patches currently in stable-queue which might be from jgr...@suse.com are

queue-4.14/xen-don-t-reschedule-in-preemption-off-sections.patch

Reply via email to