On 2/28/2022 12:19, Sebastian Huber wrote:
On 26/02/2022 08:03, Kinsey Moore wrote:
On 2/26/2022 00:53, Sebastian Huber wrote:
On 26/02/2022 00:41, Kinsey Moore wrote:
This may also be an issue for ARM, RISC-V, and others: ARM does not appear to save CPSR during a context switch, and I could not tell whether RISC-V does this either, though I am less familiar with it.

This doesn't look like the right way to fix this issue.

There is currently the assumption that all processors start multitasking with a context switch to _Thread_Handler() which sets the interrupt level. It is possible to construct a scenario in which we start multitasking with a migration of a thread which already executed the _Thread_Handler() prologue. This would result in an execution with disabled interrupts. I think the proper fix for this scenario is to enable interrupts in _CPU_SMP_Prepare_start_multitasking().
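To make the scenario concrete, here is a minimal toy model in C. It is not RTEMS code: toy_cpu, toy_thread and toy_start_multitasking() are hypothetical names invented for illustration, and the only behaviour modelled is the one described above, namely that the _Thread_Handler() prologue is the sole place where the interrupt level gets set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy types; none of these names exist in RTEMS. */
typedef struct {
  bool interrupts_enabled;
} toy_cpu;

typedef struct {
  bool prologue_done; /* has this thread already run the entry prologue? */
} toy_thread;

/*
 * Models the first context switch on a processor that starts multitasking.
 * Assumption: the processor arrives here with interrupts disabled, and only
 * the thread entry prologue sets the interrupt level.
 */
static void toy_start_multitasking(toy_cpu *cpu, toy_thread *heir)
{
  assert(cpu != NULL && heir != NULL);

  cpu->interrupts_enabled = false;

  if (!heir->prologue_done) {
    /* Fresh thread: stand-in for the _Thread_Handler() prologue. */
    cpu->interrupts_enabled = true;
    heir->prologue_done = true;
  }
  /*
   * Migrated thread: execution resumes past the prologue, so nothing in
   * this model enables interrupts again.
   */
}
```

With a fresh heir the processor ends up with interrupts enabled. With an heir that has already run its prologue on another processor, interrupts_enabled stays false, which is the problematic case.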

Doing a context switch with interrupts disabled is a fatal application error on all architectures with

#define CPU_ENABLE_ROBUST_THREAD_DISPATCH TRUE

or enabled SMP support.
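As a rough sketch of that rule (again a toy model, not the actual RTEMS implementation; model_cpu_config and model_check_dispatch() are made-up names), the check can be thought of like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type for the toy check. */
typedef enum {
  MODEL_DISPATCH_OK,
  MODEL_DISPATCH_FATAL
} model_dispatch_status;

/* Hypothetical configuration; the fields mirror the conditions named above. */
typedef struct {
  bool interrupts_enabled;     /* interrupt state at the time of the dispatch */
  bool robust_thread_dispatch; /* models CPU_ENABLE_ROBUST_THREAD_DISPATCH */
  bool smp_enabled;            /* models a build with SMP support enabled */
} model_cpu_config;

/*
 * Returns MODEL_DISPATCH_FATAL when a context switch is attempted with
 * interrupts disabled on a configuration where that is treated as a
 * fatal application error.
 */
static model_dispatch_status model_check_dispatch(const model_cpu_config *c)
{
  assert(c != NULL);

  bool check_required = c->robust_thread_dispatch || c->smp_enabled;

  if (check_required && !c->interrupts_enabled) {
    return MODEL_DISPATCH_FATAL;
  }

  return MODEL_DISPATCH_OK;
}
```

In this model a uniprocessor configuration without the robust option passes the check even with interrupts disabled, while any SMP configuration does not.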

Ok, great. I was wondering if that was the case and this is definitely the kind of feedback I was looking for. I'll adjust the patch set to reflect that. I still wonder if this is an issue on other SMP CPU ports, though, since most of them don't implement that hook, either.

I would like to have a closer look at this next week when I am back from holidays.

Enabling interrupts in _CPU_SMP_Prepare_start_multitasking() would not work since we use the interrupt stack at this point. We should add a ticket and a test case for this (I can do this next week). How did you observe this bug?

I was only able to observe this bug once the 2/2 patch is applied. That optimization opens a race condition in the sppercpudata01 test on SMP configurations, since the task is migrating across CPUs as CPUs are coming online (adding a few no-ops to the Per_CPU_Control accessor prevents it from appearing). The race condition resolves nominally in 90% of cases, so while the failure is not frequent, it is reproducible.


Kinsey

_______________________________________________
devel mailing list
devel@rtems.org
http://lists.rtems.org/mailman/listinfo/devel
