From: Aaro Koskinen <aaro.koski...@nokia.com>

There is a narrow window where the SVC kthread may go to sleep with the
maximum (infinite) timeout, missing the wakeup/shutdown from the client
and making the client hang:

        Client process                  SVC kthread
        ==============                  ===========

stratix10_svc_done()
                                while (!kthread_should_stop())
   kthread_stop();
                                   ret_fifo = kfifo_out_spinlocked();
      wake_up_process();
                                   /* kthread is already running. */
      wait_for_completion();
                                   if (!ret_fifo)

                                      /* kthread going to sleep and nobody
                                       * will wake it up unless there is a
                                       * timeout. */
                                      schedule_timeout_interruptible();

      /* Client waits for the
       * kthread to wake up and
       * stop. */

The race window is quite narrow, so in normal use the hang is difficult
to reproduce. The following artificial method was used to trigger a hang
with the stratix10-rsu driver by writing to "reboot_image":

        - Create 100% background CPU load (e.g. "while :; do true; done &"
          multiple times).

        - Insert a busy-looping mdelay(1000) into the kernel thread just
          before schedule_timeout_interruptible(). This does not change the
          program logic, only the timing.

        - Now write to "reboot_image"; the write should hang immediately.

        - Examining stack traces, the client process is shown as stuck in
          kthread_stop() and the kthread remains sleeping and scheduled out,
          as predicted:

        # cat /proc/493/stack
        [<0>] __switch_to+0xe0/0x15c
        [<0>] kthread_stop+0x9c/0x270
        [<0>] stratix10_svc_done+0x58/0xd0
        [<0>] rsu_send_msg+0xa0/0x120
        [<0>] reboot_image_store+0x9c/0xe0
        [<0>] dev_attr_store+0x24/0x40
        [<0>] sysfs_kf_write+0x50/0x60
        [<0>] kernfs_fop_write_iter+0x124/0x1b4
        [<0>] new_sync_write+0xf0/0x190
        [<0>] vfs_write+0x21c/0x280
        [<0>] ksys_write+0x74/0x100
        [<0>] __arm64_sys_write+0x28/0x3c
        [<0>] el0_svc_common.constprop.0+0x9c/0x210
        [<0>] do_el0_svc+0x78/0xa0
        [<0>] el0_svc+0x20/0x30
        [<0>] el0_sync_handler+0x1a4/0x1b0
        [<0>] el0_sync+0x180/0x1c0

        # cat /proc/494/stack
        [<0>] __switch_to+0xe0/0x15c
        [<0>] svc_normal_to_secure_thread+0x5d8/0x1430
        [<0>] kthread+0x150/0x160
        [<0>] ret_from_fork+0x10/0x3c

As a workaround, make the kthread poll for the stop request once a
second instead of going into an infinite sleep.

Upstream-Status: Pending
Signed-off-by: Aaro Koskinen <aaro.koski...@nokia.com>
[patch provided by Nokia directly]
Signed-off-by: Liwei Song <liwei.s...@windriver.com>
---
 drivers/firmware/stratix10-svc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/firmware/stratix10-svc.c b/drivers/firmware/stratix10-svc.c
index 65a566cba99a..2aae36906616 100644
--- a/drivers/firmware/stratix10-svc.c
+++ b/drivers/firmware/stratix10-svc.c
@@ -552,7 +552,7 @@ static int svc_normal_to_secure_thread(void *data)
                                        &chan->svc_fifo_lock);
 
                if (!ret_fifo) {
-                       schedule_timeout_interruptible(MAX_SCHEDULE_TIMEOUT);
+                       schedule_timeout_interruptible(HZ);
                        continue;
                }
 
-- 
2.35.5
